I've been meaning to write a relay that keeps multiple revisions of replaceable events.

hopefully in a couple of weeks I'll have the time; it should be fairly straightforward, and clients that can't handle revisions wouldn't break (they already need to filter for the most recent event from multiple relays)

nostr:npub1fjqqy4a93z5zsjwsfxqhc2764kvykfdyttvldkkkdera8dr78vhsmmleku I know you were interested in this too, have you done any work in this direction?

Discussion

I had the idea of a 20/20 rule: keep the newest 20 revisions of replaceable events, or (if there are fewer) keep revisions for at most 20 months.
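A minimal sketch of that rule (the names, and the assumption that the current version is always kept, are mine, not from any relay):

```go
// 20/20 retention sketch: keep a revision while it is among the newest 20
// for its address and no older than roughly 20 months; the current version
// is assumed to always be kept.
package retention

import "time"

const (
	maxRevisions = 20
	maxAge       = 20 * 30 * 24 * time.Hour // roughly 20 months
)

// keep reports whether a revision should be retained, given its rank from
// newest (0 = current version) and when it was created.
func keep(rank int, createdAt, now time.Time) bool {
	if rank == 0 {
		return true // never drop the current version
	}
	return rank < maxRevisions && now.Sub(createdAt) <= maxAge
}
```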

writing a GC is a solution that covers more cases with fewer special cases though, all it requires is a last-accessed timestamp

GC?

Ah, garbage collection. 😅 Thanks!

garbage collector

you track each event's last access time, and then when you do a GC run you grab all the event serials along with their last-access timestamps, collect the size of each of these events, sort them from oldest to newest access, total up the sizes, then count off enough of the least recently accessed events to bring you down to your "low water mark" target and delete them
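A minimal sketch of that selection pass, assuming each stored event is tracked by a serial with its size and last-access time (the type and field names here are illustrative, not replicatr's actual ones):

```go
// GC candidate selection: sort by last access (oldest first) and count off
// events until the total stored size drops to the low water mark.
package gc

import (
	"sort"
	"time"
)

type eventInfo struct {
	Serial       uint64
	Size         int64
	LastAccessed time.Time
}

func selectForDeletion(events []eventInfo, total, lowWater int64) []uint64 {
	sort.Slice(events, func(i, j int) bool {
		return events[i].LastAccessed.Before(events[j].LastAccessed)
	})
	var victims []uint64
	for _, ev := range events {
		if total <= lowWater {
			break
		}
		victims = append(victims, ev.Serial)
		total -= ev.Size
	}
	return victims
}
```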

in badger db i did this in a "batch" transaction which uses divide and conquer to break the operation into a series of parallel operations (ideally as many as there are CPU threads) and it happens very fast
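For the delete side, a hedged sketch using the upstream badger WriteBatch API (replicatr's actual key layout and serials will differ; the 8-byte key here is illustrative):

```go
// Delete the selected serials in one badger write batch; badger groups the
// deletes into large internal transactions instead of one per key.
package gc

import (
	"encoding/binary"

	"github.com/dgraph-io/badger/v4"
)

func deleteSerials(db *badger.DB, serials []uint64) error {
	wb := db.NewWriteBatch()
	defer wb.Cancel()
	for _, s := range serials {
		key := make([]byte, 8)
		binary.BigEndian.PutUint64(key, s)
		if err := wb.Delete(key); err != nil {
			return err
		}
	}
	return wb.Flush()
}
```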

by doing it this way you solve multiple problems with one action, and events that haven't been accessed in a long time are the best candidates for removal... old replaceable events will naturally fall into that category, because clients mostly just want the newest version, accessing old versions is infrequent, and more often than not you'd only want the next most recent one or so anyway, so they expire from the cache without any extra rules or logic

Ah, very clever.

it's pretty much how Go manages memory, though their algorithm is substantially more clever than mine and has been refined over 15 years. i would eventually want to make it dynamic so it reacts to bursts in traffic, spacing out and shortening GC passes to minimise latency for client requests during heavy traffic; all of these are similar cases handled by the Go GC, which, BTW, was mostly engineered by the guy who built the V8 javascript engine, the basis of nodejs and chrome's javascript interpreter

yes, i removed the replaceable delete function; you can literally just write a filter that asks for an unbounded number of events of a replaceable kind associated with a pubkey, and the newest version comes first followed by all the rest
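As a sketch of what that query looks like from a client, using the go-nostr Filter type (the kind and pubkey here are placeholders):

```go
// Ask for every stored revision of a replaceable kind for one author; with
// no limit set, a relay that keeps revisions returns them newest first and
// the client keeps whichever versions it wants.
package main

import (
	"encoding/json"
	"fmt"

	"github.com/nbd-wtf/go-nostr"
)

func main() {
	filter := nostr.Filter{
		Kinds:   []int{30818},             // e.g. wiki articles
		Authors: []string{"<pubkey-hex>"}, // placeholder
	}
	b, _ := json.Marshal(filter)
	fmt.Println(string(b)) // prints the filter as JSON
}
```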

you'll find it's really easy to modify any relay to do this, just change the special cases for replaceable events

as someone pointed out to me already, clients already sort and pick the newest if multiple come back from these queries, so it's just a matter of removing that racy delete

also, deleting events other than when the author (or admin) sends a delete message is a silly idea, much better to have garbage collection that just scans from time to time and removes the most stale versions long after they are out of date

if clients were more savvy with this, they could easily implement rollback when you make a wrong update

yes, this rollback / revision control is what I've wanted to implement in wikifreedia for weeks now, but I haven't had the time to modify my relay

are you storing full events locally or are you storing deltas and computing the full payload when serving them?

do you have this running on replicatr? any URL I can test on?

storing full events

you have to enable the GC size limit for it to have a high and low water mark, and you can configure those further if the defaults don't fit your case; even further, you can create a second-level data store, which would presumably be a shared data store accessed over the network, and the headroom above the high water mark will then store the indexes of events that have fallen out of the local cache but still allow fast filter searches
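In other words, something with this rough shape (the field names are illustrative, not replicatr's actual options):

```go
// Hypothetical GC / L2 configuration shape: GC only runs once a size limit
// is set, the water marks are derived from it, and an optional second-level
// store catches events that fall out of the local cache.
package config

type GCConfig struct {
	SizeLimit int64   // bytes of local event + index storage; 0 disables GC
	HighWater float64 // fraction of SizeLimit that triggers a GC pass
	LowWater  float64 // fraction the pass prunes back down to
	L2        string  // optional shared/network event store; the headroom
	// above HighWater keeps only the indexes of evicted events, so filter
	// searches stay fast while the full events live in the L2 store
}
```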

https://mleku.net/replicatr is the core, which is a fork of khatru, and https://mleku.net/eventstore is the eventstore with GC enabled for the "badger" backend; there is an "l2" event store that lets you plug in two event stores, one usually badger and the other anything else, and there is a "badgerbadger" which i wrote using two levels of badger event store, one with GC on and L2 enabled, that tests the GC once your event and index storage size exceeds the size limit

btw, fiatjaf is wrong about badger, he just doesn't know how to use it or write good, bug-proof binary encoding libraries... the batch processing functions are incredibly fast: a 15gb database can be measured in ~8 seconds, and if a GC pass is needed that might take another 5-12 seconds depending on how far over the limit it got

also, yes, that will scale; on a 20-core Threadripper with 40MB of cache and 128GB of memory it would probably zip through that job in less than half that time

how much has replicatr and your eventstore deviated from khatru and fj's eventstore? is it a drop-in(ish) replacement? almost all my custom relays are based on khatru.

do you have NIP-50 support on your eventstore? I needed to add that for wikifreedia's search

the eventstore is almost drop-in except for the definition of the (basically identical) eventstore interface

most code written to work with khatru's arrays of closures can also be quickly adapted

no, i haven't got around to doing that... full text search, right? it requires writing another index, though it may be easier to get it happening sooner if you use a DB engine that already has that as a turn-key option

the Internet Computer database engine has some kind of complex indexing scheme on it and would likely be easy to make do this, but the badger event store is bare bones; all it is built to do is fast filter searches and GC... it would not be hard to add more indexes, but it would be a couple of months' work, i'd estimate
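To illustrate what "another index" means here, a rough sketch (my own illustration, not replicatr code) of an inverted index over badger-style keys: one key per (token, event serial) pair, so a NIP-50 search term becomes a prefix scan and intersecting the serials gives the result set.

```go
// Inverted-index sketch for full text search: write one key per
// (token, serial) under a dedicated prefix when an event is stored.
package fts

import (
	"encoding/binary"
	"strings"
)

const ftsPrefix = byte(0x10) // hypothetical table prefix for the search index

// indexKeys builds the keys to write (values can be empty) when an event
// with the given serial is stored.
func indexKeys(serial uint64, content string) [][]byte {
	seen := map[string]bool{}
	var keys [][]byte
	for _, tok := range strings.Fields(strings.ToLower(content)) {
		if seen[tok] {
			continue
		}
		seen[tok] = true
		key := []byte{ftsPrefix}
		key = append(key, tok...)
		key = append(key, 0x00) // separator so "foo" scans don't match "foobar"
		key = binary.BigEndian.AppendUint64(key, serial)
		keys = append(keys, key)
	}
	return keys
}
```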

well, i think i could get an MVP in 1 month anyhow

curious if it would still honour delete requests or no?

Yes, but I will go add delete events specifically to the whitelist.

Done.

Maybe you could return just the latest event by default and only return the history when asked for with a limit > 1 or some other criteria.

yup, this is what I had in mind too, but mainly to avoid sending more data than most clients will probably use

Hmm, but it doesn't make sense to specify a limit when you want (the latest version of) multiple replaceable events.

Nostr's crappy querying language fails again. We need JOINs.

It's probably better to have a special relay or a special subdomain just for the relay that archives stuff though. And then clients should know to use that when they want old stuff.

Yeah, I wanted to set up the archive, but I need someone more familiar with relays and archiving to do it, as that isn't really our area of expertise. And blows up our meagre budget. 😬

Would be good to have at least one public archive relay, in addition to the couple of public "other stuff" relays we now have.

yeah, I think the only moment where you would return multiple versions is when you're being queried for something in particular

kinds: [30818], pubkey: [fiatjaf], #d: ["ipfs"], limit: 10

perhaps this warrants adding a new filter?

kinds: [30818], pubkey: [fiatjaf], #d: ["ipfs"], revisions: 10
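Purely as an illustration of that proposal (nothing like this exists in any NIP today, and the real filter field for pubkeys is "authors"):

```go
// Hypothetical "revisions" filter extension: the standard filter fields
// plus a cap on how many old versions per replaceable address to return.
package main

import (
	"encoding/json"
	"fmt"
)

type revisionsFilter struct {
	Kinds     []int    `json:"kinds,omitempty"`
	Authors   []string `json:"authors,omitempty"`
	DTags     []string `json:"#d,omitempty"`
	Revisions int      `json:"revisions,omitempty"`
}

func main() {
	f := revisionsFilter{
		Kinds:     []int{30818},
		Authors:   []string{"<fiatjaf-pubkey-hex>"}, // placeholder
		DTags:     []string{"ipfs"},
		Revisions: 10,
	}
	b, _ := json.Marshal(f)
	fmt.Println(string(b))
}
```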

Indeed, that solves it.

Not the new filter, I don't think it's necessary at least for now.

This would be great for contact list recovery on metadata.nostr.com