Let me see if I understand the demo in your video. You requested a list of Odell’s followers and ranked them by personalized PageRank using franzap’s pubkey as the source node (the source is the specific node in the graph from which the pPR calculation is initiated). The results came back very quickly. Is this because you precomputed franzap’s personalized PageRank? For me that takes about 15 seconds because the graph is very large: approximately 170,000 pubkeys extending out 7 or 8 hops and involving millions of follows. Or did you restrict the graph to only Odell’s followers, which would have fewer nodes and edges and compute much faster?


Discussion

Everything is done in real time

That doesn’t completely answer the question

We requested the ranks of Fran’s top 5 followers, with the ranks computed by personalized PageRank from Odell’s point of view.

The algorithm runs on the whole graph of about 300k nodes, and it's done in real time. We've designed a very fast probabilistic version of personalized PageRank. That's why it's so fast.
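A probabilistic personalized PageRank is typically approximated with short random walks rather than full matrix iteration: visit frequencies of walks restarting at the source converge to the pPR vector. A minimal sketch of that idea (the toy graph, node names, and parameters below are illustrative, not the actual implementation described above):

```python
import random
from collections import Counter

def monte_carlo_ppr(graph, source, num_walks=10_000, damping=0.85, seed=42):
    """Approximate personalized PageRank by simulating random walks.

    graph: dict mapping node -> list of followed nodes (outgoing edges).
    Each walk starts at `source`; at every step it terminates with
    probability (1 - damping), otherwise it hops to a random out-neighbour.
    Normalized visit counts approximate the personalized PageRank vector.
    """
    rng = random.Random(seed)
    visits = Counter()
    for _ in range(num_walks):
        node = source
        while True:
            visits[node] += 1
            neighbours = graph.get(node)
            if not neighbours or rng.random() > damping:
                break
            node = rng.choice(neighbours)
    total = sum(visits.values())
    return {n: c / total for n, c in visits.items()}

# Tiny toy follow graph (hypothetical pubkeys, not real Nostr data)
follows = {
    "odell":   ["franzap", "alice"],
    "franzap": ["odell", "bob"],
    "alice":   ["odell"],
    "bob":     ["franzap"],
}
ranks = monte_carlo_ppr(follows, source="odell")
```

The appeal of this approach is that accuracy is tunable: fewer walks trade precision for speed, and the top-ranked nodes stabilize long before the tail does.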

Ok, that makes sense. That’s impressive. The neo4j algo that I’m using is iterative; my default personalized PageRank params are 20 iterations and a 0.85 damping factor. The neo4j Graph Data Science library is pretty mature and also has an algorithm called eigenvector centrality, which I haven’t played with. It could be faster, I don’t know.
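For comparison, the iterative scheme with those defaults (20 iterations, 0.85 damping) can be sketched in plain Python. The graph and function name are hypothetical, and Neo4j's actual implementation differs in detail, but the structure is the same: push rank along out-edges each pass, returning all restart mass to the source node.

```python
def iterative_ppr(graph, source, iterations=20, damping=0.85):
    """Personalized PageRank by power iteration; `iterations` and
    `damping` mirror the defaults mentioned above.

    graph: dict node -> list of out-neighbours (follows).
    All restart (teleport) mass returns to `source`, which is what
    makes the ranking personalized rather than global.
    """
    rank = {n: (1.0 if n == source else 0.0) for n in graph}
    for _ in range(iterations):
        nxt = dict.fromkeys(graph, 0.0)
        for n, out in graph.items():
            if out:
                share = damping * rank[n] / len(out)
                for m in out:
                    nxt[m] += share
            else:
                nxt[source] += damping * rank[n]  # dangling node
        nxt[source] += 1.0 - damping  # restart mass goes to the source
        rank = nxt
    return rank

# Toy follow graph with hypothetical pubkeys
follows = {
    "odell":   ["franzap", "alice"],
    "franzap": ["odell", "bob"],
    "alice":   ["odell"],
    "bob":     ["franzap"],
}
ppr = iterative_ppr(follows, source="franzap")
```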

I’ve made some measurements on my current implementation. Looks like neo4j takes 3 seconds to project the main graph onto a subgraph and only 2 seconds to run the PageRank algo. Interesting. The graph projection step is only necessary because I’m maintaining a bigger graph that is multipurpose, so those 3 seconds could be shaved off. The rest of the 15 seconds is probably my front end being slow.

Regardless, the linear algebra method is inherently faster than the iterative method, as we’ve discussed.
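To illustrate that point: power iteration only approaches the fixed point, while the linear-algebra formulation solves (I − αPᵀ)p = (1 − α)e exactly in one step. A toy numpy sketch (the 4-node matrix, node order, and damping value are made up for illustration):

```python
import numpy as np

# Column-stochastic transition matrix for a hypothetical 4-node graph
# (node order: odell, franzap, alice, bob). P[j, i] = probability of
# stepping from node i to node j.
P = np.array([
    [0.0, 0.5, 1.0, 0.0],   # edges into odell
    [0.5, 0.0, 0.0, 1.0],   # edges into franzap
    [0.5, 0.0, 0.0, 0.0],   # edges into alice
    [0.0, 0.5, 0.0, 0.0],   # edges into bob
])
alpha = 0.85                         # damping factor
e = np.array([1.0, 0.0, 0.0, 0.0])  # restart vector: all mass on odell

# Direct method: one linear solve, exact up to floating point.
p_direct = np.linalg.solve(np.eye(4) - alpha * P, (1 - alpha) * e)

# Iterative method: 20 approximate power-iteration passes.
p_iter = e.copy()
for _ in range(20):
    p_iter = alpha * P @ p_iter + (1 - alpha) * e
```

Both vectors agree to within the iteration's residual (roughly α²⁰ ≈ 0.04 here), which is why the direct solve can win when a good sparse solver is available.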

It’s nice to see stuff starting to get built in this space.