Almost all the recent gossip development has been by nostr:npub1hlq93jdtkfg29a8s7fqzzzh82q3pkc20rxucwt4geh6e56wk3y2qxdz5wg with nostr:npub10000003zmk89narqpczy4ff6rnuht2wu05na7kpnh3mak7z2tqzsv8vwqk doing design. Those guys deserve zaps.

I'm going through a "pinch point" where my life is really busy (finishing the house, 3 sets of taxes, too much firewood, truck/insurance issue) but I swear I will get back into gossip development (and other nostr-related things like NIPs and my personal relay idea) as soon as I can and finish the most important things (zaps, syncing settings on the network, media cache expire/etag/etc, inbox feed to include if tagged in content, reposting, finish search, DMs, some kind of key rotation/recovery scheme, way better at-mention UI, plus whatever new NIPs I've missed that are relevant and important).

What is your personal relay idea?


A lightweight relay that works as both a READ and a WRITE relay, but only for messages related to me. So people can send me things (to my READ relay) but nobody else can read them back, and anybody can follow what I WRITE. I looked into nostr-rs-relay and strfry and neither is easily tweakable to do this, but tweaking one would probably be easier than starting from scratch... yet, you know me, I want to start from scratch in rust; it would be fun. So I did, and got the very basic memory-mapping and lock-free concurrent indexing code working... ya know, the hard part, the fun part... and now I can't be bothered to finish the boring part.
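One way that READ/WRITE gating might look, sketched in Python for brevity (the relay itself is Rust). This is a hypothetical illustration, not gossip's actual code: `MY_PUBKEY` is a placeholder, and the tag layout follows the NIP-01 shape (e.g. `["p", "<pubkey>"]`).

```python
MY_PUBKEY = "0" * 64  # placeholder hex pubkey, not a real key

def accept_event(event):
    """Inbound (READ relay): store only my own events or events that
    p-tag me; reject everything else."""
    if event["pubkey"] == MY_PUBKEY:
        return True
    return any(len(t) >= 2 and t[0] == "p" and t[1] == MY_PUBKEY
               for t in event.get("tags", []))

def serve_event(event, requester_pubkey):
    """Outbound (WRITE relay): anyone may read what I wrote; events that
    others sent to me go only back to me (an authenticated session)."""
    if event["pubkey"] == MY_PUBKEY:
        return True
    return requester_pubkey == MY_PUBKEY
```

So an unauthenticated subscriber sees only my own events, while events addressed to me are stored but never served back out to anyone else.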

My brother told me it took him more than a year of working on postgres to truly appreciate how many man-hours have gone into it and how optimized it is for so many different kinds of situations... and that I will never be able to match it starting from scratch... and that avoiding the SQL parsing and query planning isn't going to pay off when the data statistics change and my hardcoded solution becomes wrong... but he did understand the desire to have a much smaller thing without all the crap I'm not going to be using.

Sounds really great. What is memory-mapping?

Instead of using a database I'm storing events in a file. Memory mapping lets you access a file as if it were entirely in memory; the OS pages it in and out for you automatically, handling the actual reads and writes.
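The idea in miniature, using Python's stdlib `mmap` for brevity (the relay itself is Rust, and the file name here is made up): the file is pre-sized, mapped, and then read and written through the map with plain slicing, with no explicit `read()`/`write()` calls.

```python
import mmap
import os
import tempfile

# Create a small "event file" to map. mmap needs a nonzero-length file.
path = os.path.join(tempfile.mkdtemp(), "events.db")
with open(path, "wb") as f:
    f.write(b"\x00" * 4096)

with open(path, "r+b") as f:
    mm = mmap.mmap(f.fileno(), 0)   # map the whole file
    mm[0:5] = b"hello"              # write through the map, no write() call
    first5 = bytes(mm[0:5])         # read through the map, no read() call
    mm.flush()                      # ask the OS to push dirty pages to disk
    mm.close()
```

After the map is closed, the bytes are really in the file on disk; the OS did the paging behind the scenes.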

Each event just gets appended, which requires a write lock (one writer at a time), but readers can keep reading while writes are happening, so it is less contentious than a traditional RWLock, which blocks readers while writes happen.
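The append-only discipline is what makes lock-free reads safe: committed bytes never change, so only appenders need to coordinate. A Python sketch under that assumption (an in-memory bytearray stands in for the real memory-mapped file; `EventLog` and the length-prefix format are illustrative, not the actual serialization):

```python
import struct
import threading

class EventLog:
    """Append-only, length-prefixed event log. One writer at a time;
    readers take no lock, because bytes are immutable once appended."""

    def __init__(self):
        self.buf = bytearray()
        self.write_lock = threading.Lock()  # serializes appenders only

    def append(self, event: bytes) -> int:
        with self.write_lock:               # one writer at a time
            offset = len(self.buf)
            # 4-byte little-endian length prefix, then the event bytes
            self.buf += struct.pack("<I", len(event)) + event
            return offset                   # indexes store this offset

    def read(self, offset: int) -> bytes:   # no lock needed
        (n,) = struct.unpack_from("<I", self.buf, offset)
        return bytes(self.buf[offset + 4 : offset + 4 + n])
```

Because the length is encoded at the offset, an index entry only needs to store the offset itself to recover the whole event.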

The indexes are functionally similar to key-value btrees which map certain query tuples like (pubkeyhex, createdat, idhex) to an offset into the event file (the length of the event is encoded in the serialization, so we only need the offset). The serialization is very minimal, just enough to have a length and to be linear. Each key in each index must be unique (or else we would lose events), which is why they end in the "idhex". We can scan a time slice (before/since) because entries are grouped together in the btree, so I can iterate from (pubkey, before) to (pubkey, since+1) to get all of that person's events within that time range.

I'm intersecting multiple index results by hashing the results of one index, then checking that map as I scan through the next index, discarding candidates that come up missing from the previous index. That should cover all possible queries, not in an ideal way, but in a simple and relatively fast way.

strfry has more indices (all the common ones) and falls back to scanning a flatbuffer event sequence for matches. It also uses LMDB for the events rather than trying to memory map directly. I'm using sled for the indexing. I don't know whether my index-merging scheme is going to be faster or not. Probably slower in most cases, but faster than strfry in the cases where strfry doesn't have a suitable index. That's my guess anyhow. My code is at least easy to reason about. I came up with mine before talking to Doug.
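A toy Python model of one such index and the merge step. A sorted list with `bisect` stands in for a sled btree; keys end in idhex so two events by the same author at the same second stay distinct, and the scan here runs ascending for simplicity (a relay would typically walk newest-first). All names are illustrative.

```python
import bisect

# Sorted ((pubkeyhex, created_at, idhex), offset) tuples, btree-style.
index = []

def put(pubkey, created_at, idhex, offset):
    bisect.insort(index, ((pubkey, created_at, idhex), offset))

def scan(pubkey, since, until):
    """Offsets of pubkey's events with since <= created_at <= until.
    Sentinel idhex/offset values bracket the real entries."""
    lo = bisect.bisect_left(index, ((pubkey, since, ""), -1))
    hi = bisect.bisect_right(index, ((pubkey, until, "\xff" * 64), 1 << 62))
    return [offset for (_key, offset) in index[lo:hi]]

def intersect(result_lists):
    """Merge per-index results: hash the first index's results, then
    filter against each further index, discarding misses."""
    survivors = set(result_lists[0])
    for results in result_lists[1:]:
        survivors &= set(results)
    return survivors

# Sample entries: (pubkey, created_at, idhex) -> offset in the event file
put("alice", 10, "id1", 0)
put("alice", 20, "id2", 100)
put("alice", 20, "id3", 200)
put("bob", 15, "id4", 300)
```

Because same-pubkey keys are adjacent in sorted order, a time-slice query is one contiguous range scan, and multi-filter queries reduce to intersecting the offset sets from each index.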

This sounds awesome

Pardon my ignorance, but it sounds like you are developing a database. What does your file-based system do better than sqlite3 or a time-series db?

Just read the comment you're responding to.

The biggest challenge in designing a Nostr relay isn't the tech. Persistence strategy, architecture, and platform and language trade-offs are reasonably well understood.

The big issue is NIPs: how the clients implement them, when to build them, what the test cases are. These all change over time, so it can feel like building on quicksand.

I do agree that it is chaotic. One thing that should stay true is that NIPs should not change once merged, except to fix actual bugs. Anything beyond that should be a new NIP, so that NIPs can serve as unique capability identifiers.

Databases are either general purpose and therefore not optimized for your specific situation (e.g. sqlite3, which isn't even very concurrent), or else they are massive things that can be tuned for a plethora of different situations (e.g. postgres). So you either suffer the "not optimal" or the "massive". By rolling your own, you suffer neither.

I'm doing it because it is fun and is a learning experience. I am well aware that I am reinventing the wheel. My brother is a core postgres developer. We have spoken on the phone; he understands what I am doing, says it is similar to what postgres already does, and asks why I don't just use postgres. Because I wouldn't learn anything if I just used postgres. And it would be massive and a deployment nightmare.

I'm using sled (a key value database) as the underlying technology of my indices, so strictly speaking I am using a database and I'm not entirely reinventing. The remaining parts that I chose to write myself were not too difficult. The specific use cases of a personal nostr relay are rather constrained so I can optimize for them fairly quickly without it taking months of work.

This is impressive. Sounds really fun.