any relay operators willing to share a db dump? looking for a nostr dataset for WoT experiments. #asknostr

Discussion

So interested in hearing your results.

Try nostr:npub1jlrs53pkdfjnts29kveljul2sm0actt6n8dxrrzqcersttvcuv3qdjynqn. He is more responsive, and if he can’t help you he probably knows someone who can.

I don't run any public relays, otherwise I'd hook you up. This kind of research is interesting.

what say you nostr:npub12262qa4uhw7u8gdwlgmntqtv7aye8vdcmvszkqwgs0zchel6mz7s6cgrkj ? willing to contribute data to some data science?

Sure, I can give you 269 (91%) kind 5 delete events and 18 kind 1 events from almost a couple of days of running an open relay, if that helps. :)

Sure

Though this prompted me to check my relay's db, which I was just using as a personal backup, and found it had 3 million events (1.2 million kind 1). I know it's 99% spam but 3 million?! That seems like a lot to me

I’ve seen lots of kind:0 spam too. Is your relay a large public relay or just your personal backup? I'm looking for the former.

It's a personal backup, though now I'm curious: how many events do the large public relays store? Tens, hundreds of millions?

You could also use a tool like this to collect a lot of data from all the open relays yourself. It also gives you control over which kinds of events you fetch.

https://github.com/fiatjaf/nak

If you need to be 100% sure that your data is a full dump, this might not be enough, but otherwise it would basically allow you to collect as much data as you want.

For example, this command dumps all events from my relay in JSON format, which can be imported into any database you like or processed as is.

> nak req wss://relay.reinikainen.in

You could also filter it like:

> nak req -k 1 wss://relay.reinikainen.in

...to get just kind 1 regular text notes and ignore everything else.
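Since people in this thread were comparing how many events of each kind their relays hold, here is a rough sketch of how a dump could be tallied by kind. It assumes the output of the command above was redirected to a file with one JSON event per line; the file name `events.jsonl` and the script name `count_kinds.py` are made up for illustration, so adjust them to whatever you actually have.

```python
import json
import sys
from collections import Counter


def count_kinds(lines):
    """Tally events by their `kind` field, skipping blank or non-JSON lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. a stray log line mixed into the dump
        # Only count JSON objects that actually carry a `kind` field.
        if isinstance(event, dict) and "kind" in event:
            counts[event["kind"]] += 1
    return counts


# Hypothetical usage: python count_kinds.py events.jsonl
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], encoding="utf-8") as dump:
        for kind, n in sorted(count_kinds(dump).items()):
            print(f"kind {kind}: {n}")
```

Something like `nak req wss://relay.reinikainen.in > events.jsonl` followed by `python count_kinds.py events.jsonl` should then print one line per kind. If the dump is formatted differently (for instance a single JSON array), the parsing step would need to change.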