Expiring notes likely won't save that much space. Lists use the most space (mute lists, contact lists, etc.).


Discussion

Any other observations or insights after running this thing for a while?

The past 30 days are the hottest data for kind 1. The drop-off rate for likes, replies, and zaps is huge; rarely does much happen after a week.

Unless you're doing search, you likely don't need to keep events forever at present. There isn't a great way to discover old events unless you search or someone posts or replies to one (usually from a user profile timeline).

Long-form content likely needs a longer lifespan. Maybe creators will repost or send a small edit to keep it in relay DBs. Creators will likely pay to keep it available.

If you replace events like kinds 0/3/10002 or the 30000 range, you'll significantly reduce data. I'll get the exact stat, but kind 3 data is 5-10x all kind 1 data, which is the second highest (and that's while keeping the old kind 3 events too).
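A minimal sketch of what "replacing" means here, assuming the usual NIP-01 rules (kinds 0/3/10002 replace on pubkey+kind, the 30000 range also keys on the first "d" tag). The `Event` struct and `store` helper are hypothetical, just to illustrate keeping only the newest copy:

```rust
use std::collections::HashMap;

#[derive(Clone)]
struct Event {
    pubkey: String,
    kind: u32,
    created_at: u64,
    tags: Vec<Vec<String>>,
}

// Replaceable key: (pubkey, kind) plus the first "d" tag for the 30000 range.
fn replace_key(ev: &Event) -> Option<(String, u32, String)> {
    match ev.kind {
        0 | 3 | 10002 => Some((ev.pubkey.clone(), ev.kind, String::new())),
        30000..=39999 => {
            let d = ev
                .tags
                .iter()
                .find(|t| t.first().map(|s| s.as_str()) == Some("d"))
                .and_then(|t| t.get(1).cloned())
                .unwrap_or_default();
            Some((ev.pubkey.clone(), ev.kind, d))
        }
        _ => None, // regular events are stored as-is, not shown here
    }
}

fn store(db: &mut HashMap<(String, u32, String), Event>, ev: Event) {
    if let Some(key) = replace_key(&ev) {
        match db.get(&key) {
            Some(old) if old.created_at >= ev.created_at => {} // already have newer
            _ => {
                db.insert(key, ev); // older copy is overwritten, saving space
            }
        }
    }
}
```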

Spam makes up 90% of event volume when unfiltered.

So far, most active users have between 4,000 and 15,000 events total. Often that's less than 10MB each.

There are around 8-10 top relays, and then a whole heap of mid-tier relays that have a lot of events but aren't syncing from other relays.

That’s some general stuff I’ve seen anyway.

Good insights, thanks #[6]

This is solely the raw JSON data. Once you add other columns, extract out to other tables, and add indexes... my Postgres DB is at 130GB. Very little spam in there.

Keep in mind I haven't purged old kind 3 events, because I generate change-in-followers-over-time graphs, and I sometimes need to regenerate that data while I improve it.

Oh, and I don't persist kinds in the 20k range. I suspect the kind 5 delete counts are high due to spam as well, and likely some historic kind 42 channel spam too.
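For context, the 20k range (20000-29999) is the ephemeral range in NIP-01, so skipping persistence is just a kind check; a trivial sketch:

```rust
// Ephemeral events (kinds 20000-29999) can be relayed to subscribers
// and dropped instead of written to the database.
fn should_persist(kind: u32) -> bool {
    !(20_000..30_000).contains(&kind)
}
```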

What a great contribution. And with data 👏🏼

Interesting. Lists could be optimised in the backend by indexing the pubkeys and storing only lists of indexes.

I do this for tags. The issue is that relays serve JSON, and unless you store the JSON in a ready-to-serve format, generating JSON events on demand is very computationally expensive because of all the refs/joins.
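A rough sketch of the tradeoff being described, with a hypothetical in-memory interner standing in for the real tables: pubkeys are stored once as integer ids, but every read has to expand those ids back into full "p" tags before the JSON event can be served.

```rust
use std::collections::HashMap;

#[derive(Default)]
struct PubkeyTable {
    ids: HashMap<String, u32>,
    keys: Vec<String>,
}

impl PubkeyTable {
    // Write path: store each pubkey once and reference it by id.
    fn intern(&mut self, pk: &str) -> u32 {
        if let Some(&id) = self.ids.get(pk) {
            return id;
        }
        let id = self.keys.len() as u32;
        self.keys.push(pk.to_string());
        self.ids.insert(pk.to_string(), id);
        id
    }

    // Read path: every request has to expand ids back into full "p" tags
    // before JSON can be generated -- the refs/joins cost mentioned above.
    fn to_tags(&self, ids: &[u32]) -> Vec<Vec<String>> {
        ids.iter()
            .map(|&id| vec!["p".to_string(), self.keys[id as usize].clone()])
            .collect()
    }
}
```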

I understand.

In my own relayer, soon to be open source, I'm storing things as raw JSON. I'm thinking of compressing it and storing it as binary (the same format as the data to be signed). Maybe that will save some percentage.
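A sketch of that idea, assuming the serde_json and flate2 crates: serialize the event in the NIP-01 signing form `[0, pubkey, created_at, kind, tags, content]` and deflate it before writing to disk.

```rust
use flate2::{write::ZlibEncoder, Compression};
use serde_json::json;
use std::io::Write;

fn compress_event(
    pubkey: &str,
    created_at: u64,
    kind: u32,
    tags: &[Vec<String>],
    content: &str,
) -> std::io::Result<Vec<u8>> {
    // Same array layout that the event id and signature are computed over.
    let canonical = json!([0, pubkey, created_at, kind, tags, content]).to_string();

    // Zlib-compress the canonical bytes for storage.
    let mut enc = ZlibEncoder::new(Vec::new(), Compression::default());
    enc.write_all(canonical.as_bytes())?;
    enc.finish()
}
```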

I think strfry does this with FlatBuffers?

Whichever post gets the most 🤙🏻 will get to stay forever; shitposting will go to the trash 🚮 like a self-cleaning 🧼 process! 💭 That's an idea 💡.

FlatBuffers is a good choice, as long as it has a strict schema (to avoid storing metadata, like BSON/JSON do).

Although my relayer is built in Rust, my top priority is to launch it as soon as possible. Once it's up and running, I can focus on optimizing it further. The main requirements for the relayer are that the signature matches the reconstructed event and that the content is compressed to minimize bandwidth usage.
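One way to check that requirement, sketched under the assumption of the sha2 and hex crates: the reconstructed event is only valid if the sha256 of its canonical serialization still equals the original event id, since that id is what the schnorr signature was produced over.

```rust
use sha2::{Digest, Sha256};

// Event id = sha256 of the canonical [0, pubkey, created_at, kind, tags, content] JSON.
fn recompute_id(canonical_json: &str) -> String {
    hex::encode(Sha256::digest(canonical_json.as_bytes()))
}

// If the id survives the decompress/reconstruct round trip, the existing
// signature remains valid; schnorr verification against this id can then be
// done with a secp256k1 library.
fn round_trip_ok(original_id: &str, reconstructed_canonical_json: &str) -> bool {
    recompute_id(reconstructed_canonical_json) == original_id
}
```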