But does this so-called “log n” scaling even matter in the real world, in real numbers?

A performance decrease of only ~3x on a minimal part of the work (reading the graph rather than analyzing it), at a scale increase of over 100,000x, is insanely good, especially compared to all the compactions and the tremendous write workload that will be needed at that scale...
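To put rough numbers on it: lookup cost in a balanced index (a B-tree, say) grows with log n, so going from ~10^4 to ~10^9 entries (a 100,000x jump) only multiplies the per-lookup cost by about 2.25x. A quick sanity check:

```python
import math

# Lookup cost in a balanced index grows ~ log2(n).
small = 10**4  # entries before scaling up
large = 10**9  # entries after a 100,000x scale-up

slowdown = math.log2(large) / math.log2(small)
print(f"log2({large:,}) / log2({small:,}) = {slowdown:.2f}x")
# -> roughly 2.25x more comparisons per lookup, despite 100,000x more data
```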


Discussion

A billion users doesn’t mean n = a billion. If you’re looking at how people interact with each other, you’ll need a lot of indices for a lot of events.

It does mean 1 billion nodes in an index, though.

If I want to discover instances where Alice reacts to a kind 1 note authored by Bob, I’ll need indices for content in addition to indices for users.

Or suppose I want a chart of total zap amount in sats as a function of how far down a note is on a thread. Now suppose I only want to count zaps sent by frens (mutual follows) of the thread initiator. Or suppose I want a query even more complicated than that.
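To make that concrete, here is a minimal sketch of that fren-filtered zap query; the record shapes and names are hypothetical stand-ins for real nostr events (kind-1 notes, kind-9735 zap receipts, kind-3 follow lists):

```python
from collections import defaultdict

# Hypothetical, simplified records standing in for real nostr events.
notes = {
    "n0": {"author": "alice", "parent": None},   # thread root
    "n1": {"author": "bob",   "parent": "n0"},
    "n2": {"author": "carol", "parent": "n1"},
}
zaps = [  # (zapped note, sender, amount in sats)
    ("n0", "bob", 100), ("n1", "carol", 21), ("n2", "bob", 500),
]
follows = {  # who follows whom
    "alice": {"bob"}, "bob": {"alice", "carol"}, "carol": {"bob"},
}

def depth(note_id):
    """How far down the thread a note sits (root = 0)."""
    d = 0
    while notes[note_id]["parent"] is not None:
        note_id = notes[note_id]["parent"]
        d += 1
    return d

initiator = "alice"
# frens = mutual follows of the thread initiator
frens = {p for p in follows[initiator] if initiator in follows.get(p, set())}

sats_by_depth = defaultdict(int)
for note_id, sender, amount in zaps:
    if sender in frens:
        sats_by_depth[depth(note_id)] += amount

print(dict(sats_by_depth))  # e.g. {0: 100, 2: 500}
```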

We’ll need indices for all content events, all zaps, all nostr events. Trillions.

You take all zap events and use some fast OLAP DB to crunch through the data instantly, instead of wasting your time with a graph DB.
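One way that could look, using DuckDB as a stand-in for “some fast OLAP DB” (the choice of engine and the schema are my assumptions, not the comment’s):

```python
import duckdb  # one example of a fast columnar OLAP engine

con = duckdb.connect()  # in-memory database
con.execute("""
    CREATE TABLE zaps(note_id TEXT, sender TEXT, sats BIGINT, depth INT)
""")
con.execute("INSERT INTO zaps VALUES ('n0','bob',100,0), ('n2','bob',500,2)")

# A columnar scan plus aggregate over all zap events in one pass,
# with no per-node graph traversal needed.
print(con.execute("""
    SELECT depth, SUM(sats) AS total_sats
    FROM zaps GROUP BY depth ORDER BY depth
""").fetchall())
# -> [(0, 100), (2, 500)]
```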

This is a simple merge join on two indexes, which is cheap.
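A minimal sketch of that merge join, assuming two indexes already sorted on note id (the index contents here are hypothetical):

```python
def merge_join(left, right):
    """Join two streams sorted on their join key: O(len(left) + len(right))."""
    i = j = 0
    while i < len(left) and j < len(right):
        lkey, lval = left[i]
        rkey, rval = right[j]
        if lkey < rkey:
            i += 1
        elif lkey > rkey:
            j += 1
        else:
            yield lkey, lval, rval
            i += 1  # assumes unique keys per side, for brevity
            j += 1

# Hypothetical index contents, both sorted on note id:
alice_reactions = [("n1", "+"), ("n4", "🤙")]          # reactions by Alice, keyed by reacted note
bob_notes = [("n1", "gm"), ("n2", "pv"), ("n4", "lfg")]  # kind-1 notes authored by Bob

for note_id, reaction, content in merge_join(alice_reactions, bob_notes):
    print(f"Alice reacted {reaction!r} to Bob's note {note_id} ({content!r})")
```

Because both inputs arrive in sorted order straight from the indexes, the join is a single linear pass with no hashing or random I/O.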