I was hoping to use Sqlite3 and avoid JSON in the database itself. I don't plan to have messages be easily reconstructable, since this is mostly for storing and analyzing things, but I guess I could store raw messages as blobs for future use/relaying.

The big problem I'm having is how to deal with tags in nip01 events. I want them in separate tables for easier indexing/lookup but think join tables (e.g. for the many-to-many relationship between public keys and events) might be making the solution too complex.

I'm sure plenty of people have already whiteboarded this and am wondering what the general consensus is so far.

Reply to this note

Please Login to reply.

Discussion

I will say, for hardcore analytics queries you are going to push up against sqlite limitations quite quickly, as the amount of data you store increases. OLAP workloads (like I assume you plan on doing) don't do well on SQLite (I'm like SQLite's biggest fan). I ran a similar project for a while that did some analytics on a few hundred megs of data in a SQLite database but by the time you start doing big ol window queries etc it really breaks down. I'm talking second to second and a half best case stuff.

If your db grows to any size and/or you really start doing some gnarly queries then I am betting it will want a bigger dbms. You can always start with sqlite though and then port to postgres if your needs grow, sqlite tries to keep its syntax compatible where it can, especially with postgres.

Thanks. I may reconsider and start off with Postgres. I want to make a command line tool that anyone can download/run to get metadata and statistics about relays and interactions, hence wanting Sqlite for ease of use. It should give info that's more in-depth than a normal client would show, but not a monstrous Hadoop thing. I want it to be able to answer questions like, "Is such-and-such relay dropping my events specifically, or is it broken?" or "Which relays are filtering such-and-such words?" But I also want it to be able to reconstruct conversations, social graphs, and trends eventually to prove that Nostr is one big circlejerk or to find the pockets of people who don't just talk about zaps.

Sounds like Postgres is probably the way to go. It supports arrays semi-natively, too, which may help solve the nip01 tags problem a little more easily.

Please, and thank you. I lovelolovNostr for its decentralization thanks to rejection of a blockchain and it's hard to keep using it faithfully while every 2nd post is a blockchain shilling.