The past 30 days are the hottest data for kind 1. The drop-off in likes, replies and zaps is huge. Rarely does much happen after a week.
Unless you’re doing search, at present, you likely wouldn’t need a relay to keep events forever. There isn’t a great way to discover old events unless you search or someone posts or replies to one (usually from a user profile timeline).
Long form content likely needs a longer lifespan. Maybe creators will repost or push a small edit to keep it in relay DBs. Likely creators will pay to keep it available.
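A rough sketch of what a per-kind retention rule along those lines could look like. The numbers here are assumptions for illustration only (30 days for kind 1, no expiry for kind 30023 long form, 90 days for everything else), and RETENTION_DAYS, DEFAULT_RETENTION_DAYS and is_expired are made-up names, not part of any relay implementation.

```python
import time

DAY = 86400  # seconds in a day

# Hypothetical retention windows in days, keyed by event kind.
# None means "never expire". These values are illustrative assumptions.
RETENTION_DAYS = {
    1: 30,        # short text notes: most activity is in the first weeks
    30023: None,  # long form content: keep indefinitely in this sketch
}
DEFAULT_RETENTION_DAYS = 90  # assumed fallback for kinds not listed above


def is_expired(event, now=None):
    """Return True if the event is older than the window for its kind."""
    now = time.time() if now is None else now
    days = RETENTION_DAYS.get(event["kind"], DEFAULT_RETENTION_DAYS)
    if days is None:
        return False
    return now - event["created_at"] > days * DAY
```

A relay would run something like this as a periodic cleanup job. A paid tier could be modeled by giving specific pubkeys their own, longer windows.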
If you replace events for kinds like 0, 3, 10002 or the 30000 range (keeping only the latest version), you’ll significantly reduce data. I’ll get the stat, but kind 3 data is 5-10X all kind 1 data, and kind 1 is the second highest (that’s when you keep the old kind 3 events too).
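Here is a minimal sketch of that replacement step, assuming events are plain dicts with the usual id, pubkey, kind, created_at and tags fields. The kind ranges and the tie-break (same timestamp, lowest id wins) follow my reading of NIP-01. replace_key and prune_replaced are hypothetical helper names.

```python
def replace_key(event):
    """Return the key under which only the newest event is kept, or None
    if the event is a regular (non-replaceable) one."""
    kind = event["kind"]
    if kind in (0, 3) or 10000 <= kind < 20000:
        # Replaceable: one event per (pubkey, kind).
        return (event["pubkey"], kind)
    if 30000 <= kind < 40000:
        # Addressable: one event per (pubkey, kind, d tag value).
        d = next(
            (t[1] for t in event.get("tags", []) if len(t) >= 2 and t[0] == "d"),
            "",
        )
        return (event["pubkey"], kind, d)
    return None


def prune_replaced(events):
    """Keep all regular events, but only the latest version of each
    replaceable or addressable event."""
    kept = []
    latest = {}
    for ev in events:
        key = replace_key(ev)
        if key is None:
            kept.append(ev)
            continue
        cur = latest.get(key)
        newer = cur is None or ev["created_at"] > cur["created_at"]
        tie = (
            cur is not None
            and ev["created_at"] == cur["created_at"]
            and ev["id"] < cur["id"]
        )
        if newer or tie:
            latest[key] = ev
    return kept + list(latest.values())
```

With old contact lists (kind 3) dropped this way, each user carries one kind 3 event instead of every version they ever published.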
Spam makes up 90% of event volume, when unfiltered.
So far most active users have between 4,000 and 15,000 events total. Often that’s less than 10MB each.
There are around 8-10 top relays, and then a whole heap of mid tier relays that have a lot of events, but aren’t syncing from other relays.
That’s some general stuff I’ve seen anyway.