They often capture and distribute the events at the input/output step so that multiple instances of the same data set exist at different company offices. Their events now have unique keys.

It's all moving toward the Nostr concept, it's true.

Discussion

yup... all that's needed is some more experimentation with distributed dynamic cache strategies and more use of the pub/sub model. the sub side is there already and can be made to run fairly simply, but the push side needs to be there too, or at least a subscription model in which the emitted events go into a queue, so that when a subscriber drops, it gets sent the backlog it missed in the interim. this isn't really that complicated to implement either; in fact, i wrote a sub thing using SSE, and all i have to do to make it resilient is create a primary subscription queue, add progress monitoring of subscribers, and use a slightly different contract than the best-effort delivery that is the current standard.
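
roughly what i mean, as a minimal Go sketch (just the backlog-queue idea, not the actual SSE wiring; all names here are illustrative, not real code from realy):

```go
// Minimal sketch of a "primary subscription queue": events are appended to a
// per-subscriber backlog, and a reconnecting subscriber is first sent
// everything it missed while it was away.
package main

import (
	"fmt"
	"sync"
)

type Event struct {
	ID   string
	Data string
}

type Subscriber struct {
	backlog []Event // events queued while the subscriber is offline
	online  bool
}

type Broker struct {
	mu   sync.Mutex
	subs map[string]*Subscriber
}

func NewBroker() *Broker { return &Broker{subs: make(map[string]*Subscriber)} }

// Publish queues the event for every subscriber; delivery is decoupled from
// publication, so a dropped subscriber never loses events.
func (b *Broker) Publish(ev Event) {
	b.mu.Lock()
	defer b.mu.Unlock()
	for _, s := range b.subs {
		s.backlog = append(s.backlog, ev)
	}
}

// Resume is what a (re)connecting subscriber calls: it drains the backlog it
// missed in the interim and marks the subscriber live again.
func (b *Broker) Resume(id string) []Event {
	b.mu.Lock()
	defer b.mu.Unlock()
	s, ok := b.subs[id]
	if !ok {
		s = &Subscriber{}
		b.subs[id] = s
	}
	missed := s.backlog
	s.backlog = nil
	s.online = true
	return missed
}

func main() {
	b := NewBroker()
	b.Resume("alice") // initial subscribe
	b.Publish(Event{ID: "1", Data: "hello"})
	b.Publish(Event{ID: "2", Data: "world"})
	fmt.Println(b.Resume("alice")) // replays the two missed events
}
```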

i will make this in my rewrite of realy too... it will be a relay-side thing: a separate "sync" endpoint with a thread maintaining the IDs of recently stored events in a cache, plus per-subscriber queue state management that always sends events, and the receiver acks them, instead of the current fire-and-forget scheme.
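
a hypothetical sketch of that ack-based queue, just to show the contract change (none of this is realy's actual API):

```go
// Events stay in the per-subscriber queue until the receiver acknowledges
// them by sequence number, replacing fire-and-forget delivery.
package main

import (
	"fmt"
	"sync"
)

type QueuedEvent struct {
	Seq int64  // relay-assigned sequence number
	ID  string // nostr event id
}

type SyncQueue struct {
	mu      sync.Mutex
	nextSeq int64
	pending []QueuedEvent // events sent but not yet acked
}

// Enqueue records a newly stored event for this subscriber.
func (q *SyncQueue) Enqueue(eventID string) QueuedEvent {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.nextSeq++
	ev := QueuedEvent{Seq: q.nextSeq, ID: eventID}
	q.pending = append(q.pending, ev)
	return ev
}

// Ack drops everything up to and including seq; anything after it will be
// resent if the subscriber drops and reconnects.
func (q *SyncQueue) Ack(seq int64) {
	q.mu.Lock()
	defer q.mu.Unlock()
	i := 0
	for i < len(q.pending) && q.pending[i].Seq <= seq {
		i++
	}
	q.pending = q.pending[i:]
}

// Unacked returns the events that still need (re)delivery.
func (q *SyncQueue) Unacked() []QueuedEvent {
	q.mu.Lock()
	defer q.mu.Unlock()
	out := make([]QueuedEvent, len(q.pending))
	copy(out, q.pending)
	return out
}

func main() {
	var q SyncQueue
	q.Enqueue("ev-a")
	q.Enqueue("ev-b")
	q.Enqueue("ev-c")
	q.Ack(2)                 // receiver confirmed the first two
	fmt.Println(q.Unacked()) // only ev-c remains to be resent
}
```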

semi started me thinking in this direction when we came up with the idea of creating an endpoint that lets a client learn the internal sequence number of events, since that allows pull-side queuing, but i think push-side queuing would work even better.
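
the pull-side variant would look roughly like this; the endpoint and field names are assumptions, just to show catch-up by sequence number:

```go
// The relay exposes its internal sequence counter, and a client that knows
// the last sequence it saw can ask for everything stored after it.
package main

import (
	"fmt"
	"sort"
	"sync"
)

type StoredEvent struct {
	Seq int64
	ID  string
}

type Relay struct {
	mu     sync.Mutex
	seq    int64
	events []StoredEvent // kept in ascending Seq order
}

func (r *Relay) Store(id string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.seq++
	r.events = append(r.events, StoredEvent{Seq: r.seq, ID: id})
}

// LatestSeq is what the hypothetical sequence-number endpoint would return.
func (r *Relay) LatestSeq() int64 {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.seq
}

// Since returns every event stored after the given sequence number, which is
// all a client needs for pull-side catch-up.
func (r *Relay) Since(seq int64) []StoredEvent {
	r.mu.Lock()
	defer r.mu.Unlock()
	i := sort.Search(len(r.events), func(i int) bool { return r.events[i].Seq > seq })
	return append([]StoredEvent(nil), r.events[i:]...)
}

func main() {
	var r Relay
	r.Store("a")
	last := r.LatestSeq() // client records where it is up to
	r.Store("b")
	r.Store("c")
	fmt.Println(r.Since(last)) // client pulls only b and c
}
```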

with just this one feature added, you can have a whole cluster of relays all keeping up to date with the latest from each other, with multiple levels of propagation. it can also be bidirectional, so for example two relays can stay in sync with each other in both directions. that requires extra state management so they don't waste time sending events back to the peer they received them from.
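
the anti-echo state could be as simple as remembering which peer each event arrived from, something like this (illustrative names only):

```go
// Remember the origin peer of every event and never propagate an event back
// to the peer that sent it, so bidirectional sync doesn't loop.
package main

import "fmt"

type Propagator struct {
	origin map[string]string // event ID -> peer it arrived from ("" = local)
	peers  []string
}

func NewPropagator(peers []string) *Propagator {
	return &Propagator{origin: make(map[string]string), peers: peers}
}

// Receive records a new event and where it came from; returns false if the
// event was already seen, so it is not re-propagated.
func (p *Propagator) Receive(eventID, fromPeer string) bool {
	if _, seen := p.origin[eventID]; seen {
		return false
	}
	p.origin[eventID] = fromPeer
	return true
}

// Targets lists the peers an event should be forwarded to: everyone except
// the peer that delivered it to us in the first place.
func (p *Propagator) Targets(eventID string) []string {
	var out []string
	for _, peer := range p.peers {
		if peer != p.origin[eventID] {
			out = append(out, peer)
		}
	}
	return out
}

func main() {
	p := NewPropagator([]string{"relay-a", "relay-b"})
	p.Receive("ev1", "relay-a")
	fmt.Println(p.Targets("ev1")) // [relay-b] -- never echoed back to relay-a
}
```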

the other thing that's required is that relays need configurable garbage collection strategies, so you can have master/archival relays with huge storage, and smaller ones that prune items that have stopped being hot in order to contain their utilization: archive relays and cache relays.
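
a strategy interface makes that configurable; here's a rough sketch with made-up types and thresholds:

```go
// An archive relay keeps everything; a cache relay evicts the least recently
// accessed events once it exceeds a size budget.
package main

import (
	"fmt"
	"sort"
)

type Record struct {
	ID         string
	Size       int64
	LastAccess int64 // unix seconds of the last read
}

// GCStrategy decides which records to evict given the current set.
type GCStrategy interface {
	Evict(records []Record) []string // returns IDs to delete
}

// ArchiveGC never deletes anything.
type ArchiveGC struct{}

func (ArchiveGC) Evict([]Record) []string { return nil }

// CacheGC evicts coldest-first until the total size fits under MaxBytes.
type CacheGC struct{ MaxBytes int64 }

func (g CacheGC) Evict(records []Record) []string {
	var total int64
	for _, r := range records {
		total += r.Size
	}
	// coldest (least recently accessed) first
	sort.Slice(records, func(i, j int) bool { return records[i].LastAccess < records[j].LastAccess })
	var evict []string
	for _, r := range records {
		if total <= g.MaxBytes {
			break
		}
		evict = append(evict, r.ID)
		total -= r.Size
	}
	return evict
}

func main() {
	recs := []Record{{"old", 600, 100}, {"warm", 300, 200}, {"hot", 300, 300}}
	fmt.Println(ArchiveGC{}.Evict(recs))            // [] -- archives keep everything
	fmt.Println(CacheGC{MaxBytes: 700}.Evict(recs)) // [old] -- prune until under budget
}
```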

and then, yes, you further need a model of query forwarding, so a cache relay will propagate queries to archives to revive old records. the caches could allocate a section of their data that is just references to other records, stored with the origin of the original, now-expired event and also kept within a buffer size limit, so they know exactly which archive to fetch it from.
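
that reference section could be a small stub index like this (all names and limits are made up for illustration):

```go
// When the cache prunes an event, keep a tiny stub recording which archive
// still holds it, so a matching query can be forwarded to exactly the right
// place. The stub table itself is capped in size.
package main

import "fmt"

type Stub struct {
	EventID string
	Archive string // URL of the archive relay that holds the full event
}

type StubIndex struct {
	max   int
	order []string        // insertion order, for simple FIFO trimming
	stubs map[string]Stub // event ID -> stub
}

func NewStubIndex(max int) *StubIndex {
	return &StubIndex{max: max, stubs: make(map[string]Stub)}
}

// Remember is called when the cache prunes an event: keep only where it went.
func (s *StubIndex) Remember(eventID, archive string) {
	if _, ok := s.stubs[eventID]; !ok {
		s.order = append(s.order, eventID)
	}
	s.stubs[eventID] = Stub{EventID: eventID, Archive: archive}
	// keep the reference section within its own buffer limit
	for len(s.order) > s.max {
		oldest := s.order[0]
		s.order = s.order[1:]
		delete(s.stubs, oldest)
	}
}

// Lookup tells a query forwarder which archive to revive the event from.
func (s *StubIndex) Lookup(eventID string) (string, bool) {
	st, ok := s.stubs[eventID]
	return st.Archive, ok
}

func main() {
	idx := NewStubIndex(2)
	idx.Remember("ev1", "wss://archive-1.example")
	idx.Remember("ev2", "wss://archive-2.example")
	idx.Remember("ev3", "wss://archive-1.example") // ev1 stub gets trimmed
	fmt.Println(idx.Lookup("ev3"))
	fmt.Println(idx.Lookup("ev1")) // gone: fall back to a broader query
}
```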

lots of stuff to do... i started doing some of this with the original "replicatr", my first attempt at a nostr relay: implemented a whole GC for it, wrote unit tests for it. the whole idea was always about creating multi-level distributed storage. unfortunately there's no funding to focus on working on these things; instead i'm stuck building some social media dating app system lol

this is one thing that sockets can do better, because they don't necessarily send events all at once. i wrote the filters previously such that they sort and return results all in one whack. i think what you probably want, then, is that for each filter, the response identifies the query by a number, and the client always maintains an SSE channel that allows the relay to push results.

with this, the query can then propagate: all the results that are hot in the cache are sent immediately, and if any events required a query forward, those results can then be sent to the client over the SSE subscription connection.
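
roughly, the two-phase flow could look like this sketch, where a channel stands in for the SSE connection and the query number ties the late results back to the original filter (the field names and flow here are assumptions, not a fixed protocol):

```go
// Hot results come back immediately, tagged with a query number; results that
// needed a forward to an archive arrive later on the push channel.
package main

import (
	"fmt"
	"time"
)

type Result struct {
	QueryID int
	EventID string
	Late    bool // true if this arrived via query forwarding
}

type Cache struct {
	hot    map[string]bool
	nextID int
}

// Query returns whatever is hot right now and spawns a forward for the rest;
// late results are delivered on push, keyed by the same query ID.
func (c *Cache) Query(wanted []string, push chan<- Result) (int, []Result) {
	c.nextID++
	id := c.nextID
	var immediate []Result
	var missing []string
	for _, ev := range wanted {
		if c.hot[ev] {
			immediate = append(immediate, Result{QueryID: id, EventID: ev})
		} else {
			missing = append(missing, ev)
		}
	}
	go func() { // simulated forward to an archive relay
		time.Sleep(10 * time.Millisecond)
		for _, ev := range missing {
			push <- Result{QueryID: id, EventID: ev, Late: true}
		}
	}()
	return id, immediate
}

func main() {
	c := &Cache{hot: map[string]bool{"a": true}}
	push := make(chan Result, 8) // stands in for the client's SSE channel
	id, now := c.Query([]string{"a", "b"}, push)
	fmt.Println("query", id, "immediate:", now)
	fmt.Println("query", id, "late:", <-push) // "b" arrives after the forward
}
```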

i really, really need to have some kind of elementary event query console to do these things, a rudimentary front end. i probably should just make it a TUI; i think there is at least one existing Go TUI kind 1 client... i should just build with that, instead of fighting the bizarre lack of adequate GUIs for Go.