The bottleneck seemed to be the D1 database. Replication was enabled, but how could SQLite possibly support such a high level of traffic? After a certain number of events accumulated, it would slow down and eventually stop working (unless the D1 instance was nuked and restarted).


Discussion

Yes, in terms of writes, it should theoretically have sustained about 50k concurrent connections, or maybe 1,000 events/sec. That’s all theory, of course. But with read replicas it should be able to sustain much more overall traffic in terms of REQs. I wonder if it could be rebuilt to use Cloudflare’s Hyperdrive with an external DB. I tried to benchmark it all, but obviously have no insight into how much traffic was actually hammering the relay from the web and apps. In all honesty, I think they might need a multi-relay setup, with certain parts broken up and handled by different relays, in order to support the level of traffic a centralized system like TikTok or IG handles.

Under a single-writer ceiling, how could there be any chance? This multi-relay/specialised-types approach would be neat, but at some point you have to question why nostr in the first place. There's a limit to the number of nostr qualities you can sacrifice before you reach a worst-of-both-worlds situation: what could have been a performant centralised architecture is held back by a bunch of nostr tech debt that isn't enabling anything nostr-like in the first place.

A proper SQLite database shouldn't have issues with this amount of traffic. D1 isn't just SQLite though, it's SQLite + an API.

I would try using SQLite-in-DO instead of using D1. There's a good blog post about it here: https://blog.cloudflare.com/sqlite-in-durable-objects/

Secondly I would look at splitting up the database into multiple databases.

Yes, I’m already using SQLite-in-DO for subscription management with the WebSocket Hibernation API, and visitors are geo-steered towards a regional DO. But I haven’t tested handling the data independently across all of the DOs instead of a single D1; I think that would add some serious latency and make it unnecessarily complex. I did just roll out an update that adds Cloudflare Queues to all EVENT messages, which should help write throughput and handle bursts up to 5,000/sec.

Cloudflare, I actually didn’t know about this until I read this. The idea is great: during its hibernation period (like me in the winter, I sleep a lot), a Durable Object (DO) is taken out of memory but can still wake up to run JavaScript.

How can I help ?

I'd be curious to see how this runs on a bare metal VPS - I would expect way way higher throughput.

Another thing to try would be to batch writes - instead of writing every time something happens you collect writes in a queue and write every 50ms in one big transaction. This should help a lot with write throughput.

Reading can be done in the same way.

If you want to avoid parameter limits you can use JSON1.

Last but not least, analyze the queries using `EXPLAIN QUERY PLAN` to make sure your indices are correctly used.
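The batching and `EXPLAIN QUERY PLAN` suggestions above can be sketched with plain SQLite (Python's stdlib driver here; the events schema and index name are hypothetical stand-ins, not the relay's actual ones):

```python
import sqlite3

# Hypothetical schema standing in for a relay's event store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id TEXT PRIMARY KEY, pubkey TEXT, kind INTEGER, created_at INTEGER)"
)
conn.execute("CREATE INDEX idx_kind_created ON events (kind, created_at)")

def flush(queue):
    """Write every queued event in one transaction instead of committing one by one."""
    with conn:  # opens a transaction, commits once on success
        conn.executemany(
            "INSERT OR IGNORE INTO events (id, pubkey, kind, created_at) VALUES (?, ?, ?, ?)",
            queue,
        )
    queue.clear()

# Simulate 1000 events collected during one batching interval, then flush once.
queue = [(f"id{i}", "pk", 1, 1700000000 + i) for i in range(1000)]
flush(queue)
count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(count)  # 1000

# Sanity-check that the query planner actually uses the index:
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM events WHERE kind = 1 ORDER BY created_at DESC"
).fetchall()
print(plan)
```

The single transaction amortizes fsync/commit overhead across the whole batch, which is where most of the per-write cost goes.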

Using a VPS is sort of antithetical to what I was trying to build 😅. By being entirely “serverless” it avoids a single point of failure and other downsides such as hardening, managing updates, fault tolerance, smaller resource ceilings, etc. There are none of the traditional nuisances with Cloudflare Workers, etc. Everything is managed.

With Cloudflare Queues added, write throughput is now 5x better and it does batches of 100 to D1. The next thing on my list is to further optimize the handling of all queries and improve read performance beyond what indices and read replicas alone provide.

Yeah - totally get that, just curious 😅

Nice! Hope it works out 💪

Thanks! I've also been working on a non-SQL version. Will is right and wrong about SQL for Nostr. At scale, SQL does start to crumble. For a medium-sized (or even semi-high-traffic) relay, it's completely fine when indices are used. But at the scale of what diVine was trying to do, it wouldn't perform well. There were also a lot of things wrong from the beginning, where customizations were added but the accompanying indices weren't built for the filters/tags that would be used for every REQ.

I built a working Postgres relay (Ditto), but the only way tag queries would work was with a jsonb column on the nostr_events table with a GIN index. I have never seen tag queries work correctly when the tags table is separate and needs a join. It only works if the data is together in the same table.
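As a rough illustration of the single-table idea, here is a SQLite/JSON1 stand-in for the jsonb-column approach (Postgres would index the column with GIN and use jsonb containment instead; the table and column layout here are assumptions, not Ditto's actual schema):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Tags stored inline on the event row, as in the jsonb-column approach,
# rather than in a separate table that must be joined.
conn.execute("CREATE TABLE nostr_events (id TEXT PRIMARY KEY, kind INTEGER, tags TEXT)")
conn.execute(
    "INSERT INTO nostr_events VALUES (?, ?, ?)",
    ("ev1", 1, json.dumps([["e", "abc"], ["p", "def"]])),
)
conn.execute(
    "INSERT INTO nostr_events VALUES (?, ?, ?)",
    ("ev2", 1, json.dumps([["p", "xyz"]])),
)

# Find events carrying a ["p", "def"] tag: the JSON is unpacked in place
# with json_each, no join against a separate tags table.
rows = conn.execute(
    """
    SELECT e.id FROM nostr_events AS e, json_each(e.tags) AS t
    WHERE json_extract(t.value, '$[0]') = 'p'
      AND json_extract(t.value, '$[1]') = 'def'
    """
).fetchall()
print(rows)  # [('ev1',)]
```

Keeping the tags on the event row means one index lookup can answer the whole filter, which is the property the join-based layout loses.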

sql for nostr doesn’t work. It’s been tried.

I see! Is a document based DB the way to go? In memory?

strfry/nostrdb model is the only thing that has worked. lmdb + indexes that cover common use cases:

- created

- pubkeyKindCreated

- kindCreated

- tagKindCreated

Bonus points for storing both a JSON string and a flatbuffer version of the note for optimal performance.
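A minimal sketch of how such composite keys work in an ordered KV store like LMDB, assuming big-endian packing so byte order matches numeric order (the exact strfry/nostrdb key layouts differ; this only demonstrates the kindCreated idea):

```python
import struct

def key_kind_created(kind: int, created_at: int) -> bytes:
    # Big-endian pack so lexicographic byte order equals numeric order,
    # which is what makes LMDB range scans usable as an index.
    return struct.pack(">HQ", kind, created_at)

def key_pubkey_kind_created(pubkey: bytes, kind: int, created_at: int) -> bytes:
    # Prefix with the pubkey to group one author's events together.
    return pubkey + struct.pack(">HQ", kind, created_at)

# A sorted list stands in for an LMDB cursor over the kindCreated index.
index = sorted(
    key_kind_created(k, t) for k, t in [(1, 100), (1, 200), (7, 50), (1, 150)]
)

# Range scan: every kind-1 event, already in created_at order,
# with no query planner involved.
lo, hi = key_kind_created(1, 0), key_kind_created(1, 2**64 - 1)
kind1 = [struct.unpack(">HQ", k) for k in index if lo <= k <= hi]
print(kind1)  # [(1, 100), (1, 150), (1, 200)]
```

Each index listed above is just a different key prefix over the same events, so a REQ filter maps directly onto a cursor range scan.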

Interesting! It did not occur to me to use a KV solution for this.