Nostr Database strategies:

1. Build indexes for every possible combination of filters: It's fast, but needs a ton (GBs) of memory and 10x your database size on disk. Works great for servers.

2. Don't index all tags: Some filters just won't work and it won't find stuff, but it is fast and memory/disk efficient. Works great for small apps that have known filter needs.

3. Delete events as soon as the app doesn't need them anymore: It's fast and memory/disk efficient, but search sucks and figuring out what can be deleted is a major hassle.
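To make the trade-off in strategy 2 concrete, here is a minimal in-memory sketch in Python. The class name `PartialTagIndex` and its methods are hypothetical, not taken from any Nostr library. The only thing assumed from the protocol is that an event's `tags` field is a list of string arrays whose first item is the tag name and second item is the value.

```python
from collections import defaultdict


class PartialTagIndex:
    """Hypothetical event store that indexes only a chosen subset of tag names."""

    def __init__(self, indexed_tags):
        self.indexed_tags = set(indexed_tags)
        self.events = {}                # event id -> event dict
        self.by_tag = defaultdict(set)  # (tag name, tag value) -> set of event ids

    def insert(self, event):
        self.events[event["id"]] = event
        for tag in event.get("tags", []):
            # Only tags on the allow-list cost memory; the rest are skipped.
            if len(tag) >= 2 and tag[0] in self.indexed_tags:
                self.by_tag[(tag[0], tag[1])].add(event["id"])

    def query(self, tag_name, values):
        """Return events whose `tag_name` tag matches any of `values`."""
        if tag_name not in self.indexed_tags:
            # The cost of strategy 2: this filter simply is not supported.
            raise KeyError(f"tag '{tag_name}' is not indexed")
        ids = set()
        for value in values:
            ids |= self.by_tag.get((tag_name, value), set())
        return [self.events[i] for i in sorted(ids)]


# Example: index only "e" and "p" tags, so "t" (hashtag) filters fail.
store = PartialTagIndex(indexed_tags=["e", "p"])
store.insert({"id": "a1", "kind": 1, "tags": [["p", "alice"], ["t", "nostr"]]})
store.insert({"id": "b2", "kind": 1, "tags": [["p", "bob"]]})
found = store.query("p", ["alice"])
```

Here the sketch raises on an unindexed tag instead of silently returning nothing, which makes the unsupported filters visible to the app. A real store might fall back to a full scan instead.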

Did I miss anything?

Discussion

You can also replicate the data by streaming it to an analytical DB like ClickHouse or Google BigQuery and avoid the indexes. Updates will be a tad slower because of eventual consistency, but retrieval (and especially aggregations) is much faster.

Depends on the use case, of course.

I heard about ClickHouse. Does it work well on Android?

I am trying to develop something generic, but for apps on Android.

No, it's a server-side, self-hosted DB.

It's the Yandex equivalent of Google's BigQuery, and it's open source.

What's the use case you're solving for? Have you thought about a caching strategy instead of a full DB?

Yes, I am going back and forth on the idea of caching vs a full DB. Ideally the JSON events should stay local to make sure that searching through things you have already seen works, but that is probably too big to keep on phones (10GB for my account) and it is definitely not fast. Every time it hits the disk, it slows down to a crawl.

Ideally, I'd like to keep the DB under 75MB of memory so that I can have 400MB for the rest of the app. But the indexes alone won't fit in 75MB of RAM...

But by "search", do you mean a) searching for a keyword across all events, or b) fetching, say, the user's events for some query?

For a), I think the solution is to have ClickHouse/BigQuery behind an API call. For b), caching might actually do for 99% of users.
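As a rough illustration of the caching option for case b), here is a minimal least-recently-used event cache in Python. The class name `EventCache` and the `max_events` limit are hypothetical. A real app would more likely budget by bytes than by event count, and this sketch does nothing about persistence.

```python
from collections import OrderedDict


class EventCache:
    """Hypothetical bounded cache: keeps the most recently used events only."""

    def __init__(self, max_events):
        self.max_events = max_events
        self._events = OrderedDict()  # event id -> event, oldest use first

    def put(self, event):
        self._events[event["id"]] = event
        self._events.move_to_end(event["id"])
        # Evict least recently used events once the budget is exceeded.
        while len(self._events) > self.max_events:
            self._events.popitem(last=False)

    def get(self, event_id):
        event = self._events.get(event_id)
        if event is not None:
            # A read counts as a use, so the event moves to the "fresh" end.
            self._events.move_to_end(event_id)
        return event


# Example: with room for two events, the one not touched recently is dropped.
cache = EventCache(max_events=2)
cache.put({"id": "a", "content": "first"})
cache.put({"id": "b", "content": "second"})
cache.get("a")                               # "a" is now fresher than "b"
cache.put({"id": "c", "content": "third"})   # evicts "b"
```

Eviction by recency sidesteps the "what can be deleted" question from strategy 3, at the price of sometimes refetching from a relay.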

Search means looking for users when you @ somebody (local db of profiles FTS indexed) + searching for something you have seen (FTS of content events) + searching for both in a relay with NIP-50.
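The local profile lookup for @-mentions could look roughly like the sketch below, which uses SQLite's FTS5 extension through Python's `sqlite3` module. It assumes the SQLite build has FTS5 enabled (common, but not guaranteed). The table name `profiles`, its columns, the sample rows, and the `search_profiles` helper are all made up for illustration. On Android the same SQL would go through the platform's SQLite bindings, which is not shown here.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# pubkey is stored but not tokenized; name and about are full-text indexed.
db.execute("CREATE VIRTUAL TABLE profiles USING fts5(pubkey UNINDEXED, name, about)")
db.executemany(
    "INSERT INTO profiles (pubkey, name, about) VALUES (?, ?, ?)",
    [
        ("pk1", "alice", "bitcoin and relays"),
        ("pk2", "alicia", "android developer"),
        ("pk3", "bob", "runs a relay"),
    ],
)


def search_profiles(prefix, limit=10):
    """Return (pubkey, name) rows whose name has a word starting with `prefix`."""
    # Quote the user input so it is treated as a plain string, restrict the
    # match to the name column, and append * to make it a prefix query.
    quoted = '"' + prefix.replace('"', '""') + '"'
    query = "name : " + quoted + " *"
    rows = db.execute(
        "SELECT pubkey, name FROM profiles WHERE profiles MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


matches = search_profiles("ali")
```

The same pattern would apply to the content search, with a second FTS table over event content. The NIP-50 relay search is a network call and is not covered by this sketch.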

In practice, the API call can be replaced by a relay, which we already do. But it is too slow or inconsistent over time (the relay gets busy).

Using a NoSQL database?