Nostr Web Client

> also, SQL is a stupid way to store events when you don't need all that advanced search logic nor that many tables, plus it would be space inefficient due to all the extra indexes that come by default that are probably not possible to turn off even though the query logic doesn't need it

Sure, but it scales. I can stand up a dozen relays and load balance the databases with replicas. Might not be the most efficient, but I can make up for that with more of them. We need to be able to run dozens of relay instances off the same backend.

ᴛʜᴇ ᴅᴇᴀᴛʜ ᴏꜰ ᴍʟᴇᴋᴜ 10mo ago

nosql and kv stores have scaling strategies too, and use one of the several common replication protocols, you've just been in this a while ;)

Discussion

ChipTuner 10mo ago

You might be right. But no relays are using scalable KV services, are they? At least none that I've seen.

ᴛʜᴇ ᴅᴇᴀᴛʜ ᴏꜰ ᴍʟᴇᴋᴜ 10mo ago

nope, but i know i could add a two-tier strategy (fewer large relays, many smaller caching relays). it just requires writing one implementation of a library i already wrote and tested, for a different second-level "master" cache

ChipTuner 10mo ago

I would really like to be able to "load balance" relays, meaning they are stateless and use a backend that I can administer on my own. I don't want to rely on the relay software itself to admin the database. Again, that's where SQL wins. As an admin I can do optimization outside of the relay environment.
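A minimal sketch of what "stateless relays over a shared backend" could look like. Everything here is an assumption for illustration: `StatelessRelay`, the `events` schema, and the use of `sqlite3` as a stand-in for an externally administered SQL server are hypothetical, not taken from any relay implementation.

```python
import sqlite3

# Stand-in for an externally administered SQL server; sqlite3 is used only
# so the sketch runs without any setup. The schema is a simplification.
SCHEMA = """
CREATE TABLE IF NOT EXISTS events (
    id         TEXT PRIMARY KEY,
    pubkey     TEXT NOT NULL,
    created_at INTEGER NOT NULL,
    content    TEXT NOT NULL
)
"""


class StatelessRelay:
    """Hypothetical relay front end that keeps no event state of its own."""

    def __init__(self, db: sqlite3.Connection) -> None:
        self.db = db

    def publish(self, event: dict) -> bool:
        # INSERT OR IGNORE makes duplicate submissions harmless.
        cur = self.db.execute(
            "INSERT OR IGNORE INTO events (id, pubkey, created_at, content) "
            "VALUES (?, ?, ?, ?)",
            (event["id"], event["pubkey"], event["created_at"], event["content"]),
        )
        self.db.commit()
        return cur.rowcount == 1  # True only if the event was new

    def by_author(self, pubkey: str) -> list:
        rows = self.db.execute(
            "SELECT id FROM events WHERE pubkey = ? ORDER BY created_at DESC",
            (pubkey,),
        )
        return [row[0] for row in rows]


# One shared backend, two interchangeable relay instances in front of it.
# A real deployment would give each instance its own connection to the server.
backend = sqlite3.connect(":memory:")
backend.execute(SCHEMA)

relay_a = StatelessRelay(backend)
relay_b = StatelessRelay(backend)

relay_a.publish({"id": "e1", "pubkey": "alice", "created_at": 1, "content": "hi"})
relay_b.publish({"id": "e2", "pubkey": "alice", "created_at": 2, "content": "again"})
```

Because neither instance holds events itself, a load balancer could send a client to either one and get the same answer, and the database can be tuned or replicated with ordinary SQL tooling.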

ᴛʜᴇ ᴅᴇᴀᴛʜ ᴏꜰ ᴍʟᴇᴋᴜ 10mo ago

this is a bit like the debate about why bother enabling JWT bearer tokens (for dumb old epaper readers and postman btw) versus just use nip-98 (which doesn't have an extra expiry field)

i'm not sure what you mean by "stateless" relays, unless you mean they are dumb stores and the "master" pushes and pulls data from them, ie, it would subscribe to all new events from the slave relays, and then push them to the others

the tools to do this are not created but they are also quite trivial, and can be built as separate pieces. you can have a replicator server that just subs on all the relays and pushes to the master, and the satellites can just forward queries when they come in, and push the newly added events that came in from other satellites over the subscriptions that clients opened up for matching events

i think it would be better if you made them all simply replicators so each of the load balance relays simply pushes new events to the centre master and the master broadcasts them to all the others, instead of the master pulling them with subscriptions

subscriptions are cool and all, but broadcast is coolerer

it's all the same to me though, point being that a relay IS a database server already
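The push-then-broadcast layout described above can be sketched roughly as follows. This is an in-memory toy under stated assumptions: `Master`, `Satellite`, and their methods are hypothetical names, and a real setup would move events over relay connections rather than method calls.

```python
class Master:
    """Hypothetical central relay that fans new events out to satellites."""

    def __init__(self) -> None:
        self.events = {}      # event id -> event
        self.satellites = []

    def attach(self, satellite: "Satellite") -> None:
        self.satellites.append(satellite)
        satellite.master = self

    def push(self, event: dict, origin: "Satellite") -> None:
        if event["id"] in self.events:
            return  # already seen, nothing to rebroadcast
        self.events[event["id"]] = event
        for satellite in self.satellites:
            if satellite is not origin:
                satellite.receive(event)


class Satellite:
    """Hypothetical load-balanced relay that clients actually connect to."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.events = {}      # event id -> event
        self.master = None    # set by Master.attach

    def publish(self, event: dict) -> None:
        # A client wrote to this relay: store locally, then push upstream.
        self.events[event["id"]] = event
        if self.master is not None:
            self.master.push(event, origin=self)

    def receive(self, event: dict) -> None:
        # The master broadcast an event that arrived at another satellite.
        self.events.setdefault(event["id"], event)


hub = Master()
sat_a, sat_b, sat_c = Satellite("a"), Satellite("b"), Satellite("c")
for sat in (sat_a, sat_b, sat_c):
    hub.attach(sat)

sat_a.publish({"id": "e1", "content": "hello"})
```

After the single `publish` call on `sat_a`, the master and the other two satellites all hold the event, without the master ever opening a subscription.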

cloud fodder 10mo ago

strfry can scale quite large.. larger than what anyone needs right now. you can set up replicas and load balance between them. if you wanted to you could shard them, but you would need a sharding proxy.

either way, i think nostr scales with more people running relays and using outbox, so that's my focus (not so much a mega-scale single-relay DB, because i don't want to be like bluesky)
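For illustration only, the routing decision inside a sharding proxy could be as small as the sketch below. Hashing the author's pubkey is an assumption made here, not something strfry provides or anyone in the thread proposed, and the shard URLs and `shard_for` are hypothetical.

```python
import hashlib

# Hypothetical shard addresses; a real proxy would load these from config.
SHARDS = [
    "wss://shard-0.example",
    "wss://shard-1.example",
    "wss://shard-2.example",
]


def shard_for(pubkey: str, shards: list = SHARDS) -> str:
    """Pick a shard deterministically from the author's pubkey."""
    digest = hashlib.sha256(pubkey.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]
```

The same pubkey always maps to the same shard, so writes and author queries can be sent to one place. Queries that are not keyed by author would still have to fan out to every shard.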

ChipTuner 10mo ago

Pleb relays is the way, but some of us are going to have to handle thousands of concurrent users (hopefully)

strfry uses an embedded db, doesn't it? lmdb iirc?

How can I run multiple strfry instances from the same db?

ChipTuner 10mo ago

It's also not just for scaling, but for redundancy too. We're talking experimental and, at best, poorly tested software here.

cloud fodder 10mo ago

yes, it uses lmdb. you can run multiple processes of strfry that use the same db directly on disk if you wanted that. to scale horizontally you can replicate the db and have a load balancer in front. sure, it's not transactional like mysql is, but it is eventually consistent like a larger nosql db would be.

ChipTuner 10mo ago

well I'd have to share it across an NFS or CIFS share, which historically has been shit for things like that. Server disk space is expensive. I would much rather have "centralized" db servers in a cluster and have them available for services to connect to.

I'd like the instances to be deployable in an HA environment, so the disk can't be shared since it's not on the same server.

cloud fodder 10mo ago

you would not want to share the db this way no, you would use strfry protocol (negentropy/stream/router) to keep the replicas in sync. it would be ha. every node the same..

ChipTuner 10mo ago

But that means the db would be duplicated on every node and I would be relying on strfry to keep things in sync, outside of my ability to interact with it using standard tools like we've been using for decades with sql.

My issue is relying on a service that is designed to be a relay, also trying to be a database system. I think it's just too much to ask and can't be done well.

ᴛʜᴇ ᴅᴇᴀᴛʜ ᴏꜰ ᴍʟᴇᴋᴜ 10mo ago

i don't know what you mean by "load balanced" to be honest

nostr relays are really just a database server in themselves

do you want to replicate the database, do you want to demand cache it, do you want to shard it? it's quite important what strategy you have in mind and whether that actually fits the use case

sharding is probably gonna need to use social graph and maybe combine that with geofencing

caching is the simplest, dumbest idea, where the relays that people use don't actually store much data but they proactively fetch new data and expire old data quickly to keep the space available

caching was the way i envisioned working it... you could extend it with broadcast so that new data is shared around immediately and then with the garbage collection, rapidly purged from the store when it gets no traffic

how you design that optimization depends entirely on how the data is going to be used, and propagated

inherently, replicated Raft/Paxos/pBFT databases ARE exactly replicas. you even used the word replica, but you didn't mean replica, because you just clearly said you didn't

replica is like a bitcoin node with the same blockchain, that's a replica, all replicated database protocols make replicas, and individual replicas don't have the option to decide not to store data arbitrarily

what you are talking about is a caching strategy, which means you want to do garbage collection and broadcast
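A rough sketch of the caching idea described above: keep an event only while it sees traffic and purge it once it goes idle. `CachingRelay`, `idle_ttl`, and the explicit `now` arguments are hypothetical choices made to keep the example deterministic. The proactive fetching and broadcast parts are left out.

```python
class CachingRelay:
    """Hypothetical cache relay: keeps events only while they see traffic."""

    def __init__(self, idle_ttl: float) -> None:
        self.idle_ttl = idle_ttl
        self.events = {}     # event id -> event
        self.last_hit = {}   # event id -> time of last write or read

    def store(self, event: dict, now: float) -> None:
        self.events[event["id"]] = event
        self.last_hit[event["id"]] = now

    def fetch(self, event_id: str, now: float):
        event = self.events.get(event_id)
        if event is not None:
            self.last_hit[event_id] = now  # traffic keeps the event alive
        return event

    def collect_garbage(self, now: float) -> int:
        """Drop every event idle for longer than idle_ttl; return the count."""
        stale = [eid for eid, hit in self.last_hit.items()
                 if now - hit > self.idle_ttl]
        for eid in stale:
            del self.events[eid]
            del self.last_hit[eid]
        return len(stale)


cache = CachingRelay(idle_ttl=60)
cache.store({"id": "e1", "content": "popular"}, now=0)
cache.store({"id": "e2", "content": "ignored"}, now=0)
cache.fetch("e1", now=50)               # e1 gets traffic, e2 does not
purged = cache.collect_garbage(now=100)
```

At `now=100` the untouched event has been idle for 100 seconds and is purged, while the one read at `now=50` survives. A cache miss would then be answered by asking a larger relay, which is the part a real implementation has to add.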

cloud fodder 10mo ago

there *are* relays that use SQL.. like khatru or ditto.

cloud fodder 10mo ago

i know gleason spends lots of time and probably money on servers, performance tuning the queries 😁 nostr will absolutely slam sql databases, and it is typically much more expensive to run a sql backend than lmdb.

ChipTuner 10mo ago

ditto used postgres last I checked and seems like it still does. Closer, I guess. khatru seems to be the best option, although I have to manage a build myself if I want to connect it to my infra. Mildly acceptable.

cloud fodder 10mo ago

yeah, people seem obsessed with postgres, i think they must enjoy the pain. unsure if they can be adapted to mysql, probably not once they're done tuning it.

cloud fodder 10mo ago

there are also relays using mongodb..

ChipTuner 10mo ago

Yes, grain does. Truly not a fan of mongo myself though. I guess I'm alone in SQL Server world XD. I use it for everything, but MariaDB is a close second, except its column search sucks huge ass compared to SQL Server.

cloud fodder 10mo ago

khatru has pluggable backends so it may not be too hard to add one..

ᴛʜᴇ ᴅᴇᴀᴛʜ ᴏꜰ ᴍʟᴇᴋᴜ 10mo ago

yeah, except you get to have fun with fiatjaf's shitty concurrency and slow ass json codec along with khatru

i've firmly decided that after i finish this JWT bearer token stuff and implement the HTTP endpoints, all features cease and i build out a modular scheme for them all, and untangle all the entanglement

ᴛʜᴇ ᴅᴇᴀᴛʜ ᴏꜰ ᴍʟᴇᴋᴜ 10mo ago

oh and i forgot to mention fiatjaf's shitty event store interface which assumes you want to deal with channels and several more goroutines than you need, which make an utter hellish mess

but have fun with that anyway

i will try and build out a fully simple architecture to make it all easy to include or not include, very soon, it's driving me nuts, i made the first part nice, now feature adding, ok, too much, i'm getting claustrophobia

ChipTuner 10mo ago

I have a feeling he's not using an ORM. Sketches me out, considering SQL injection is still way higher up on the list of vulnerabilities than it should be.
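For context on the injection worry, a small sketch of the difference between building SQL by string interpolation and binding parameters, which works without an ORM. The table, the `hostile` input, and the use of `sqlite3` are assumptions for the demo, and it says nothing about how any particular relay builds its queries.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE events (id TEXT PRIMARY KEY, pubkey TEXT, content TEXT)")
db.execute("INSERT INTO events VALUES ('e1', 'alice', 'hello')")
db.execute("INSERT INTO events VALUES ('e2', 'bob', 'secret')")

# Attacker-controlled filter value.
hostile = "alice' OR '1'='1"

# Unsafe: the input is pasted into the SQL text, so its quote characters
# change the structure of the WHERE clause and every row matches.
unsafe_sql = f"SELECT id FROM events WHERE pubkey = '{hostile}'"
unsafe_rows = db.execute(unsafe_sql).fetchall()

# Safe: the input is bound as a parameter and is only ever compared as a
# literal string, so nothing matches.
safe_rows = db.execute(
    "SELECT id FROM events WHERE pubkey = ?", (hostile,)
).fetchall()
```

The interpolated query returns both rows, while the parameterized one returns none. Hand-written SQL is only as safe as its discipline about binding every client-supplied value.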