Hi fiatjaf, yes, that’s the one. At 2:00 am “compact” somehow became “compress” in my head. But you’re right, other than making certain “immortal” events deletable on LMDB as per your original intent above, for the end user, it’s essentially a major database-wide deduplication (which is exactly what I was looking for with BadgerDB).
You’re also right that nuking the database and reimporting old notes has the same effect. This is what I’m suggesting for Haven users for now. Unfortunately, Haven can’t import its own backups (yet), but users can always reimport some of their old notes from other relays or temporarily use a second instance of Haven to do this. I was just considering a Haven-specific --compact flag for completeness (e.g., so users don’t lose private notes that aren’t currently reimported) and to save them the trouble of doing this manually.
Either way, awesome work. Many thanks! Haven is absolutely flying with the new Khatru engine. I even tested this with an old LMDB database backup I keep around for testing purposes. Compacting the database cleared out over 2 million duplicated events from a database containing only around 1k short notes. It’s impressive how clients continuously spam lists, sets, etc. Now I know there was much more to it than just the Amethyst kind 10002 write loop bug.
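For anyone wondering what that deduplication amounts to, here is a minimal sketch in Go. It is not the actual Khatru or eventstore code: the `Event` struct, `Compact` and the helper names are hypothetical and exist only for illustration. The kind ranges and the tie-break rule (newest `created_at` wins, lowest id on a tie) follow NIP-01.

```go
package main

import "sort"

// Event is a simplified, hypothetical stand-in for a Nostr event.
type Event struct {
	ID        string
	PubKey    string
	Kind      int
	CreatedAt int64
	DTag      string // value of the "d" tag; only used for addressable kinds
}

// slot identifies the single "place" a replaceable or addressable event occupies.
type slot struct {
	pubkey string
	kind   int
	dtag   string
}

// isReplaceable and isAddressable use the kind ranges from NIP-01.
func isReplaceable(kind int) bool {
	return kind == 0 || kind == 3 || (kind >= 10000 && kind < 20000)
}

func isAddressable(kind int) bool {
	return kind >= 30000 && kind < 40000
}

// newer reports whether a should win over b: later timestamp first,
// then the lexically lowest id on a tie.
func newer(a, b Event) bool {
	if a.CreatedAt != b.CreatedAt {
		return a.CreatedAt > b.CreatedAt
	}
	return a.ID < b.ID
}

// Compact keeps every regular event and only the winning event per slot.
func Compact(events []Event) []Event {
	latest := make(map[slot]Event)
	var out []Event
	for _, ev := range events {
		var k slot
		switch {
		case isReplaceable(ev.Kind):
			k = slot{ev.PubKey, ev.Kind, ""}
		case isAddressable(ev.Kind):
			k = slot{ev.PubKey, ev.Kind, ev.DTag}
		default:
			out = append(out, ev) // regular events are never deduplicated
			continue
		}
		if cur, ok := latest[k]; !ok || newer(ev, cur) {
			latest[k] = ev
		}
	}
	for _, ev := range latest {
		out = append(out, ev)
	}
	// Sort for a deterministic result (map iteration order is random).
	sort.Slice(out, func(i, j int) bool {
		if out[i].CreatedAt != out[j].CreatedAt {
			return out[i].CreatedAt < out[j].CreatedAt
		}
		return out[i].ID < out[j].ID
	})
	return out
}
```

Feeding it a thousand rewrites of the same kind 10002 list from one pubkey would leave exactly one event behind, which is roughly why the backup shrank so much.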
Replaceable events should only be used for things that are written sporadically. There is some shady stuff being done out there with these, so I think it's wise to only allow some explicitly whitelisted kinds that we know aren't spammy.
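A rough sketch of what such a whitelist check could look like in Go. The `allowedKinds` set and `RejectKind` are hypothetical names, and the three kinds listed (0 for profile metadata, 3 for the follow list, 10002 for the relay list) are only an assumed starting point, not a recommendation from anyone in this thread. The `(reject, msg)` return shape is just a convenient convention for a relay-side policy function.

```go
package main

import "fmt"

// allowedKinds is an assumed allowlist of replaceable/addressable kinds.
// Which kinds belong here is a policy decision for the relay operator.
var allowedKinds = map[int]bool{
	0:     true, // profile metadata
	3:     true, // follow list
	10002: true, // relay list metadata
}

// isReplaceableOrAddressable uses the kind ranges from NIP-01.
func isReplaceableOrAddressable(kind int) bool {
	return kind == 0 || kind == 3 ||
		(kind >= 10000 && kind < 20000) ||
		(kind >= 30000 && kind < 40000)
}

// RejectKind rejects replaceable and addressable kinds that are not
// explicitly allowed. Regular kinds pass through untouched.
func RejectKind(kind int) (reject bool, msg string) {
	if isReplaceableOrAddressable(kind) && !allowedKinds[kind] {
		return true, fmt.Sprintf("blocked: kind %d is not on the replaceable allowlist", kind)
	}
	return false, ""
}
```

With this list, `RejectKind(1)` and `RejectKind(10002)` let the event through, while `RejectKind(31234)` rejects it until someone adds that kind on purpose.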
💯 To be fair, I’m encountering broken client behaviour or client/relay incompatibilities that result in spammy activity with otherwise "legit" events more often than actual malicious code or directed attacks. But you’re right, we should be doing something about it. Whitelisting specific events is a good start. Maybe I’ll build a Citrine-style dashboard for Haven so users can at least get a sense of what they’re storing in their relays. From there, we could add functionality for deleting individual events, deleting all events of a certain kind or even blocking them entirely.
For now, though, ReplaceEvents are doing a great job of preventing unnecessary database bloat. Again, many thanks.
I'm not talking about malicious stuff, but things like Amethyst draft events that rewrite the same addressable a thousillion times (I'm not sure this actually exists but I've heard it is a thing).
I’ll take a deeper look. I haven’t paid much attention to Haven’s private relay since Inbox and Outbox are always the ones on fire, but apparently, I only have three kind 31234 events (Amethyst-style drafts) across all my relays. Draft events are certainly high-frequency, but as far as I can see, they aren’t bloating the database.
List and set events, on the other hand, have been the bane of my existence. That and the fact that Amethyst still doesn’t send the right events to the correct types of relays remain my top two unsolved tech problems on Nostr. Vitor mentioned he was working on it, but it’s a non-trivial change given how much functionality has been built on top of the classical general relay model. Fingers crossed, both clients and relays will see some improvements this year. I’d prefer these two fixes over any new features.