working on an embedded database custom-built for nostr. think an embeddable strfry. It will probably be the fastest special-purpose database on the planet once it’s done.

I’ve built a custom in-memory note representation with O(1) and zero-copy access so that you can memory-map the data from inside lmdb directly into your data types without needing to serialize anything in and out.

if that didn’t make sense, the TLDR: shit is about to get real fast.

https://github.com/damus-io/nostrdb
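
For readers who want the zero-copy idea spelled out, here is a minimal sketch in C. The `demo_note` layout and both helper functions are hypothetical and are not nostrdb's actual format or API. The point is only that when a record is stored in its in-memory layout, "reading" it is a bounds check and a pointer cast, with no parsing, allocation, or copying.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed layout (NOT nostrdb's real format): a fixed-size
 * header followed directly by the NUL-terminated content bytes. */
struct demo_note {
    uint8_t  id[32];
    uint8_t  pubkey[32];
    uint64_t created_at;
    uint32_t kind;
    uint32_t content_len;  /* content length, excluding the NUL */
    char     content[];    /* flexible array member: bytes follow the header */
};

/* Write a note into buf in its final layout.
 * Returns the number of bytes used, or 0 if it does not fit. */
size_t demo_note_write(void *buf, size_t buflen, uint64_t created_at,
                       uint32_t kind, const char *content)
{
    size_t len = strlen(content);
    size_t total = sizeof(struct demo_note) + len + 1;
    if (len > UINT32_MAX || total > buflen)
        return 0;

    struct demo_note *n = buf;
    memset(n, 0, sizeof(*n));             /* id and pubkey left zeroed here */
    n->created_at = created_at;
    n->kind = kind;
    n->content_len = (uint32_t)len;
    memcpy(n->content, content, len + 1); /* include the NUL */
    return total;
}

/* "Deserialize" by validating and casting: no copy, no allocation.
 * buf could just as well point into a memory-mapped file. */
const struct demo_note *demo_note_view(const void *buf, size_t buflen)
{
    if (buflen < sizeof(struct demo_note))
        return NULL;
    if ((uintptr_t)buf % _Alignof(struct demo_note) != 0)
        return NULL;                      /* the cast needs an aligned buffer */

    const struct demo_note *n = buf;
    if (n->content_len >= buflen - sizeof(*n))
        return NULL;                      /* content + NUL must fit */
    return n;
}
```

Every field is then reachable in O(1) through the returned pointer, which aliases the original buffer. A real format would also have to pin down byte order and versioning, which this sketch ignores.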

Discussion

I feel like you wrote TLDR for me and I appreciate you.

Also, embedded has to be a good thing for performance. Right?

Nice fucking work Will!

Holy f.

🤯👆🏻

Let’s do it!

Awesome thanks Will

Will you support in memory mode so that I can use it in browser?

👀

Thanks for the Tldr; I needed it.

😂😂 me too!

now how do we shard things across a cluster of pods... and still get a coherent subscription.

“Fastest database on the planet” sounds ambitious. Unrelated: I added a CMake build and a Windows port for nostril a couple of months ago and submitted a pull request, but got no response (it also adds the secp256k1 source, but that can be taken out): https://github.com/jb55/nostril/pull/30

my bad it must have slipped my inbox

Thanks. I’m trying to port nostrdb to Windows using CMake; it seems very easy because the code is so simple. I fixed 2 build errors, and the test runs; it only fails because I have to allocate memory in ndb_tag (declared as a C pointer instead of that zero-size array declaration)
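
For context on the "zero-size array declaration" mentioned above, here is a minimal sketch of the standard C99 form, the flexible array member. The `demo_tag` struct and `demo_tag_new` are hypothetical and do not reproduce nostrdb's actual `ndb_tag`. The GNU spelling `field[0]` is a compiler extension, while `field[]` is standard C, which may matter when porting between compilers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical struct, not nostrdb's ndb_tag. */
struct demo_tag {
    uint16_t count;
    uint32_t strs[];   /* C99 flexible array member: no storage of its own */
};

/* Header and elements live in ONE allocation, so the elements sit
 * directly after the header with no pointer to chase. */
struct demo_tag *demo_tag_new(uint16_t count)
{
    struct demo_tag *t =
        malloc(sizeof(*t) + (size_t)count * sizeof(t->strs[0]));
    if (!t)
        return NULL;
    t->count = count;
    for (uint16_t i = 0; i < count; i++)
        t->strs[i] = 0;
    return t;
}
```

Declaring the field as a plain pointer instead changes the memory layout: the elements no longer follow the header, so they need a separate allocation, and the struct can no longer be overlaid on a contiguous buffer.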

I wish I understood this. How real, and how fast?

On the planet tho? Lol 🤔 I ask you what you did today and I get this? Lol

> I've built a custom in-memory note representation...

That is quite the trick, much easier to code in C than in rust. I love this idea.

This will be faster than anything I code for some time. I'm serializing/deserializing and even though I'm using a fast library (speedy) it still requires an allocation and a copy at the minimum.

Thanks for pushing me in this direction. I was debating whether to go the simple route with sqlite, but since nostr is so restricted in its data format I feel like a custom DB with a dedicated mapping would work well.

How much faster will it be than sqlite? And how much value will it add to your most important projects? It's tempting to do cool stuff but ultimately it always takes (me at least) much more time and effort than anticipated

I think about 20-100 times faster. Not for the whole program, just for the data access part.

People love the idea of 'zero copy' but 'zero allocation' is where the real gains are. Memory allocators are complex and sometimes slowish things.

I've imagined a library that manages normally cloned data (mostly strings) by owning them in some global singleton object and handing out Arc references to them (actually has to be offsets, not actual references, if you want to persist and restart your code). Mainly so that I don't have to do .clone() and .to_owned() all the fricking time feeling like I'm slowing down the code, and so I can impl Copy on all the structs and as a side effect you could access them directly in LMDB without serialization. That system also has to persist to LMDB.
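
The offsets-instead-of-references idea above can be sketched roughly as follows. This is written in C to match nostrdb rather than Rust, and every name in it (`str_pool`, `str_ref`, `pool_intern`, `pool_get`) is hypothetical. It assumes a fixed-capacity pool that starts zero-initialized, and it uses a linear scan for deduplication purely to stay short.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define POOL_CAP 4096

/* All strings live back to back in one buffer. The struct holds no
 * pointers, so it can be copied or persisted byte for byte. */
struct str_pool {
    uint32_t used;            /* bytes of data[] in use; start at 0 */
    char     data[POOL_CAP];
};

/* A handle is just an offset into the pool: a plain integer that is
 * trivially copyable and still valid after the pool is reloaded. */
typedef uint32_t str_ref;
#define STR_REF_NONE UINT32_MAX

/* Return the handle of s, storing it first if it is not present.
 * Returns STR_REF_NONE when the pool is full. */
str_ref pool_intern(struct str_pool *p, const char *s)
{
    size_t len = strlen(s) + 1;          /* include the NUL */
    uint32_t off = 0;

    /* Linear scan over the stored strings (a real pool would hash). */
    while (off < p->used) {
        if (strcmp(p->data + off, s) == 0)
            return off;
        off += (uint32_t)strlen(p->data + off) + 1;
    }

    if (len > POOL_CAP - p->used)
        return STR_REF_NONE;
    memcpy(p->data + p->used, s, len);
    off = p->used;
    p->used += (uint32_t)len;
    return off;
}

/* Resolve a handle back to a string without copying it. */
const char *pool_get(const struct str_pool *p, str_ref r)
{
    return r < p->used ? p->data + r : NULL;
}
```

Structs that hold a `str_ref` instead of an owned string contain only plain integers, which is what would let them be copied freely or read straight out of a store like LMDB.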

That's... impressively faster!

Who wants a video of Will trying to explain what he spent all day working on to me (normie)? Lol.

😂😂

Zaps help with encouragement haha

Something like this

LOL Omg I mean I wish I looked that young and he has much more hair and less beard but yes pretty accurate

The energy V… the energy. 😂

It’s on point 👏👏

Also maybe I should try this hairstyle 🤔

Hair extensions? 💅

yes please 💜

Haha will try to do this later

Make it a weekly series!!!

Stay humble

Dope!

You know shit gettin' real when "big O" notation comes out.

Is this something all relays will use by default or what is the intended use case(s)?

👀

How'd you decide to use C for this and Rust for shatter?

I already have a fast C content parser 😅. I’m using C code in Damus. Since it’s depending on lmdb, it made more sense because lmdb is a C lib as well.

This will also work in rust and notedeck, I’m planning on making a rust interface.

C is the way to code here. That JSON library, JSMN, is pretty slick, very low level. I used it before, but I use C++ now and ‘JSON for Modern C++’. Still working on my CMake port, will post here when done https://github.com/damus-io/nostrdb/commit/32b0e572e418b81af515224cc7eb3b1e8598d906

*meant to say C is the way to go, but the “way to code” works too… probably some Damus autocorrect 🙃

Don't worry, I'm not a Rust proselytizer lol, just curious and glad to see anyone writing low level efficient code these days. Excited to try out notedeck 👍

The old-timers have been doing this for a long time. Every generation thinks they are such geniuses for reinventing the wheel that a previous generation already built.

😍 C