Nostr Web Client

Has anybody tested a Nostr database that saves individual events as single files, with just an index file to find them quickly on disk?
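
To make the question concrete, here is a minimal sketch of such a layout in Python. Everything in it is an assumption for illustration, not an existing relay's design: the `FilePerEventStore` class, the `events/<first two hex chars>/<id>.json` directory scheme, and the append-only `index.jsonl` file are all hypothetical names.

```python
import json
import os


class FilePerEventStore:
    """Hypothetical layout: <root>/events/<id[:2]>/<id>.json plus <root>/index.jsonl."""

    def __init__(self, root):
        self.root = root
        self.index_path = os.path.join(root, "index.jsonl")
        self.index = {}  # event id -> path relative to root
        os.makedirs(os.path.join(root, "events"), exist_ok=True)
        # Rebuild the in-memory map from the index file, if one exists.
        if os.path.exists(self.index_path):
            with open(self.index_path, encoding="utf-8") as f:
                for line in f:
                    entry = json.loads(line)
                    self.index[entry["id"]] = entry["path"]

    def put(self, event):
        event_id = event["id"]
        # Fan out by the first two characters so no single directory gets huge.
        rel = os.path.join("events", event_id[:2], event_id + ".json")
        full = os.path.join(self.root, rel)
        os.makedirs(os.path.dirname(full), exist_ok=True)
        with open(full, "w", encoding="utf-8") as f:
            json.dump(event, f)
        # The index is append-only: one small line per stored event.
        with open(self.index_path, "a", encoding="utf-8") as f:
            f.write(json.dumps({"id": event_id, "path": rel}) + "\n")
        self.index[event_id] = rel

    def get(self, event_id):
        rel = self.index.get(event_id)
        if rel is None:
            return None
        with open(os.path.join(self.root, rel), encoding="utf-8") as f:
            return json.load(f)
```

A sketch like this does nothing about durability (no fsync) or about the per-file filesystem overhead, so it only shows the shape of the idea.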

Idk.. the current idea of all events in a single 60GB file seems bad.

Discussion

arfonzo 0mo ago

I host a public relay with nostr-rs-relay, and the sqlite database file just keeps growing... there has to be a better way.

Tekkadan, ゲロゲロ! 🐸 0mo ago

I want to believe

Diacone Frost 0mo ago

I think that would quickly end up as a 60 GB index file 🤔

I have used this scheme for years (kappa architecture):

stream - kafka - indexers (like opensearch or ad hoc/specialized processors) - kafka - apps subscribed to "processed" topics...

Vitor Pamplona 0mo ago

Maybe symbolic links for some of the indexing? We can also hashtable the IDs to minimize the size of the index

Diacone Frost 0mo ago

Maybe, folders by tags, by... links.

then folders with files grouped by timestamp or something.

vector db for search...

can be an interesting research/project

Viktor 0mo ago

ngl, going full file-per-event feels like a shotgun blast to the inode table 😅

but yeah, sparse index pointing to offset ranges / symbolic links plus a nice compact idBloom tree would keep the map tiny while keeping the data split. chunky 4-k event blobs per dir with date-partitioned symlinks = fast lookup, smaller rewinds, and rubbing fsync all over the place.

might spin it on weekends with s3fs-fuse for warm / cold storage juggling. dm me if you want diff tracking, Vector (Privacy by Principle) can nudge giftwrapped test logs your way.
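
For readers unfamiliar with the Bloom part of that idea, here is a minimal Python sketch of a plain Bloom filter over event IDs. It is not the "idBloom tree" mentioned above, whose structure the thread does not describe; the `BloomFilter` class, its default sizes, and the SHA-256 double-hashing scheme are assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Compact set membership test: no false negatives, some false positives."""

    def __init__(self, num_bits=8192, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive several bit positions from one digest (double hashing).
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so steps vary
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "probably present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )
```

With one filter per data file, a lookup only has to open the files whose filter answers "probably present", which is what keeps the map small relative to a full ID index.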

Diacone Frost 0mo ago

yep

Viktor 0mo ago

gn ✌

Vitor Pamplona 0mo ago

Nice! I am also wondering about write amplification when using single files. LMDB is the king of SSD damage. It would be cool to have something that plays nice with the SSD/eMMC in phones out there.

Viktor 0mo ago

100%, LMDB loves its 4k random overwrites, phone flash cries.

split events into append-only seq-files (snowstorm style) per day/hour + fsync-once = nearly zero WA. bloom index sits in RAM; updates only when we roll over files, easy on the eMMC wear budget.
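
A rough Python sketch of the append-only, fsync-once idea, under stated assumptions: the `SegmentLog` class, the hourly `YYYYMMDD-HH.log` file naming, and the JSON-lines encoding are hypothetical, and a plain in-RAM dict of offsets stands in for the Bloom index mentioned above.

```python
import json
import os
import time


class SegmentLog:
    """Append-only hourly segment files with one fsync per batch per segment."""

    def __init__(self, root):
        self.root = root
        self.index = {}  # event id -> (segment name, byte offset, byte length)
        os.makedirs(root, exist_ok=True)

    def _segment_name(self, created_at):
        # One file per UTC hour, derived from the event's created_at timestamp.
        return time.strftime("%Y%m%d-%H", time.gmtime(created_at)) + ".log"

    def append_batch(self, events):
        by_segment = {}
        for ev in events:
            name = self._segment_name(ev["created_at"])
            by_segment.setdefault(name, []).append(ev)
        for name, batch in by_segment.items():
            with open(os.path.join(self.root, name), "ab") as f:
                offset = f.seek(0, os.SEEK_END)
                for ev in batch:
                    line = json.dumps(ev).encode("utf-8") + b"\n"
                    f.write(line)
                    self.index[ev["id"]] = (name, offset, len(line))
                    offset += len(line)
                # Flush and sync once for the whole batch, not once per event.
                f.flush()
                os.fsync(f.fileno())

    def get(self, event_id):
        loc = self.index.get(event_id)
        if loc is None:
            return None
        name, offset, length = loc
        with open(os.path.join(self.root, name), "rb") as f:
            f.seek(offset)
            return json.loads(f.read(length))
```

Because existing bytes are never rewritten, the flash only sees sequential appends. The in-RAM index here is lost on restart and would have to be rebuilt by scanning the segments.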