Instead of using a database I'm storing events in a file. Memory mapping lets you access a file as if it were entirely in memory, and the OS pages it in and out for you, handling the actual reads and writes automatically.
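Roughly, the read path looks like this. This is a minimal sketch assuming the memmap2 crate and a 4-byte little-endian length prefix in front of each event; the file name, offset, and framing are illustrative, not necessarily what my serialization actually uses:

```rust
// Sketch: reading an event out of an append-only file through a memory map.
use std::fs::OpenOptions;
use memmap2::MmapOptions;

fn main() -> std::io::Result<()> {
    let file = OpenOptions::new().read(true).open("events.mdb")?;

    // Map the whole file; the OS pages regions in and out on demand,
    // so the file can be far larger than available RAM.
    let map = unsafe { MmapOptions::new().map(&file)? };

    // An event at a known offset can be read as a plain byte slice.
    // The offset would normally come from an index lookup.
    let offset = 0usize;
    // Assumed framing: 4-byte little-endian length prefix, then the event bytes.
    let len_bytes: [u8; 4] = map[offset..offset + 4].try_into().unwrap();
    let len = u32::from_le_bytes(len_bytes) as usize;
    let event_bytes = &map[offset + 4..offset + 4 + len];
    println!("event is {} bytes", event_bytes.len());

    Ok(())
}
```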

Each event just gets appended, which requires a write lock (one writer at a time), but readers can keep reading while writes are happening, so it is less contentious than a traditional RwLock, which blocks readers while writes happen.
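Sketched out, the append path looks something like this, assuming a plain Mutex around the file handle and the same length-prefix framing as above; the names (EventStore, append) are illustrative:

```rust
// Sketch of the single-writer append discipline: writes are serialized with a
// Mutex, while readers keep reading already-committed bytes through the memory
// map without taking any lock (committed bytes never change once written).
use std::fs::File;
use std::io::{Seek, SeekFrom, Write};
use std::sync::Mutex;

struct EventStore {
    // Only the writer path needs this lock.
    writer: Mutex<File>,
}

impl EventStore {
    /// Append one serialized event and return its offset in the file.
    /// The offset is what the indexes would store.
    fn append(&self, event_bytes: &[u8]) -> std::io::Result<u64> {
        let mut file = self.writer.lock().unwrap(); // one writer at a time
        let offset = file.seek(SeekFrom::End(0))?;
        // Assumed framing: 4-byte little-endian length prefix, then the event.
        file.write_all(&(event_bytes.len() as u32).to_le_bytes())?;
        file.write_all(event_bytes)?;
        Ok(offset)
    }
}
```

Readers never see a half-written event because an offset only becomes visible to them once the write has finished and the indexes point at it.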

The indexes are functionally similar to key-value btrees which map query tuples like (pubkeyhex, createdat, idhex) to an offset into the event file (the length of the event is encoded in the serialization, so we only need the offset). The serialization is very minimal, just enough to have a length and to be linear. Each key in each index must be unique (or else we would lose events), which is why they end in the "idhex". Because entries are grouped together in the btree, we can scan a time slice (before/since): I can iterate from (pubkey, before) to (pubkey, since+1) to get all of that person's events within that time range.

To intersect results from multiple indexes, I hash the results of one scan into a map and check that map each time I run through the next index, discarding matches that are missing from the previous index. That should cover all possible queries, not in an ideal way, but in a simple and relatively fast way.

strfry has more indices (all the common ones) and falls back to scanning a flatbuffer event sequence for matches. It also uses LMDB for the events rather than trying to memory map directly. I'm using sled for the indexing. I don't know if my index-merging scheme is going to be faster or not. Probably slower in most cases, but faster than when strfry doesn't have a suitable index. That's my guess anyhow. My code is at least easy to reason about. I came up with mine before talking to Doug.
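Roughly, the key layout, range scan, and intersection look like this. This is a sketch using sled; the exact key widths, tree names, and value encoding are assumptions, and the real iteration runs descending from before down to since+1 rather than ascending, which yields the same set:

```rust
// Sketch of one (pubkey, created_at, id) index over a sled tree, where the
// value is assumed to be the event's file offset stored as 8 big-endian bytes.
use std::collections::HashSet;

fn author_time_key(pubkey: &[u8; 32], created_at: u64, id: &[u8; 32]) -> Vec<u8> {
    let mut key = Vec::with_capacity(72);
    key.extend_from_slice(pubkey);
    // Big-endian timestamp so lexicographic key order matches chronological order.
    key.extend_from_slice(&created_at.to_be_bytes());
    key.extend_from_slice(id); // id suffix keeps every key unique
    key
}

/// Scan one author's events in [since, until] and return their file offsets.
fn scan_author_range(
    tree: &sled::Tree,
    pubkey: &[u8; 32],
    since: u64,
    until: u64,
) -> sled::Result<HashSet<u64>> {
    let start = author_time_key(pubkey, since, &[0u8; 32]);
    let end = author_time_key(pubkey, until, &[0xFFu8; 32]);
    let mut offsets = HashSet::new();
    for item in tree.range(start..=end) {
        let (_key, value) = item?;
        offsets.insert(u64::from_be_bytes(value.as_ref().try_into().unwrap()));
    }
    Ok(offsets)
}

/// Intersect the offset sets produced by several index scans: keep only the
/// offsets that showed up in every scan, discarding the rest.
fn intersect(mut sets: Vec<HashSet<u64>>) -> HashSet<u64> {
    let mut result = match sets.pop() {
        Some(s) => s,
        None => return HashSet::new(),
    };
    for set in sets {
        result.retain(|offset| set.contains(offset));
    }
    result
}
```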

Discussion

This sounds awesome

Pardon my ignorance, but it sounds like you are developing a database. What does your file system do better than sqlite3 or a time series db?

Just read the comment you're responding to.

The biggest challenge in designing a Nostr relay isn't the tech. Persistence strategy, architecture, and platform and language trade-offs are reasonably well understood.

The big issue is NIPs: how the clients implement them, when to build them, test cases. These all change over time, so it can feel like building on quicksand.

I do agree that it is chaotic. One thing that should stay true is that NIPs should not change once merged, except to fix actual bugs. Anything beyond that should be a new NIP, so that NIPs can serve as unique capability identifiers.

Databases are either general purpose and therefore not optimized for the specific situation (e.g. sqlite3, which isn't even very concurrent), or else they are massive things that can be tuned for a plethora of different situations (e.g. postgres). So you either suffer the "not optimal" or the "massive". By rolling your own, you suffer neither.

I'm doing it because it is fun and is a learning experience. I am well aware that I am reinventing the wheel. My brother is a core postgres developer. We have spoken on the phone; he understands what I am doing, says it is similar to what postgres already does, and asks why I don't just use postgres. Because I wouldn't learn anything if I just used postgres. And it would be massive and a deployment nightmare.

I'm using sled (a key-value database) as the underlying technology of my indices, so strictly speaking I am using a database and I'm not entirely reinventing. The remaining parts that I chose to write myself were not too difficult. The specific use cases of a personal nostr relay are rather constrained, so I can optimize for them fairly quickly without it taking months of work.

This is impressive. Sounds really fun.