What sort of data lake are you using?
Also, curious how you're querying it all, Rust?
Just with my laptop, storing the data in a Postgres DB. I’m using Python, picking pieces from different repos that I found online. Happy to share more details.
You going to make some cool metrics/dashboard or something? I imagine nostr is hard to map without something like this.
How are you doing relay discovery?
You just hit this endpoint and it returns all online relays: https://api.nostr.watch/v1/online
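A minimal sketch of that lookup in Python, assuming the endpoint returns a JSON array of relay URL strings (the response shape is an assumption here, and `parse_relay_list` / `fetch_online_relays` are just illustrative names):

```python
import json
from urllib.request import urlopen

ONLINE_RELAYS_URL = "https://api.nostr.watch/v1/online"


def parse_relay_list(payload: str) -> list[str]:
    """Keep unique websocket URLs from a JSON array (assumed response shape)."""
    seen = set()
    relays = []
    for item in json.loads(payload):
        # Skip anything that is not a ws:// or wss:// string.
        if isinstance(item, str) and item.startswith(("wss://", "ws://")):
            url = item.rstrip("/")  # treat "wss://x/" and "wss://x" as the same relay
            if url not in seen:
                seen.add(url)
                relays.append(url)
    return relays


def fetch_online_relays(url: str = ONLINE_RELAYS_URL, timeout: float = 10.0) -> list[str]:
    """Download the list of online relays and normalise it."""
    with urlopen(url, timeout=timeout) as resp:
        return parse_relay_list(resp.read().decode("utf-8"))
```

Calling `fetch_online_relays()` needs network access; the parsing step is split out so it can be checked offline.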
I’ll do some stuff with that data .. not with a specific plan but I have some ideas
For now I collected all my follows’ pubkeys, their follows, and their follows’ follows, which totaled around 30k pubkeys.
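Roughly, that expansion is a three-hop walk over contact lists. In nostr a contact list is a kind 3 event whose `p` tags hold the followed pubkeys (NIP-02). A sketch, assuming the latest kind 3 event per pubkey has already been fetched into a dict (`contact_lists` and both function names are hypothetical, not the actual script):

```python
def follows_from_contact_list(event: dict) -> list[str]:
    """Return the pubkeys listed in the "p" tags of a kind 3 event."""
    return [
        tag[1]
        for tag in event.get("tags", [])
        if len(tag) >= 2 and tag[0] == "p"
    ]


def expand_follows(seed: str, contact_lists: dict, max_hops: int = 3) -> set[str]:
    """Collect pubkeys reachable from `seed` within `max_hops` follow hops."""
    seen: set[str] = set()
    frontier = {seed}
    for _ in range(max_hops):
        next_frontier = set()
        for pubkey in frontier:
            event = contact_lists.get(pubkey)
            if event is None:  # no contact list found for this pubkey
                continue
            for followed in follows_from_contact_list(event):
                if followed != seed and followed not in seen:
                    seen.add(followed)
                    next_frontier.add(followed)
        frontier = next_frontier
    return seen
```

With `max_hops=3` this covers follows, their follows, and their follows’ follows, excluding the seed pubkey itself.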
I finished querying the online relays and was able to extract 45 days of all events from those pubkeys from about 200 relays. There were some relays I couldn’t connect to and some that didn’t have any data.
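For anyone wanting to try the same kind of backfill: per NIP-01, a relay is queried by sending a `["REQ", <subscription id>, <filter>]` message over the websocket, and the filter can carry `authors` and `since`. A sketch that only builds those messages, assuming the pubkeys are split into batches because relays may cap filter size (the batch size of 500 and the `backfill` subscription prefix are arbitrary choices here, not from the original script):

```python
import json
import time


def build_req_messages(pubkeys, now=None, days=45, batch_size=500, sub_prefix="backfill"):
    """Build NIP-01 REQ messages covering the last `days` days for `pubkeys`."""
    now = int(time.time()) if now is None else now
    since = now - days * 86400  # filter timestamps are unix seconds
    messages = []
    for start in range(0, len(pubkeys), batch_size):
        batch = pubkeys[start:start + batch_size]
        event_filter = {"authors": batch, "since": since}
        sub_id = f"{sub_prefix}-{start // batch_size}"
        messages.append(json.dumps(["REQ", sub_id, event_filter]))
    return messages
```

Each message would then be sent to every relay with whatever websocket client you use, reading `EVENT` messages until the relay sends `EOSE` for that subscription.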
I will now explore things like: how my posts propagate to relays I didn’t publish to, which relays have most of the activity, or which pubkeys among those 30k account for 80% of the activity, etc.
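The 80% question can be sketched as a simple cumulative count. This assumes you can pull one author pubkey per stored event out of the database (e.g. a single column) and that events have already been deduplicated across relays; `top_contributors` is a hypothetical helper:

```python
from collections import Counter


def top_contributors(event_pubkeys, share=0.8):
    """Smallest set of most active pubkeys whose events reach `share` of the total."""
    counts = Counter(event_pubkeys)
    total = sum(counts.values())
    selected = []
    covered = 0
    # Walk pubkeys from most to least active until the target share is covered.
    for pubkey, n_events in counts.most_common():
        if covered >= share * total:
            break
        selected.append(pubkey)
        covered += n_events
    return selected
```

Comparing `len(top_contributors(...))` against the roughly 30k pubkeys gives a quick read on how concentrated the activity is.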
So if you have any idea happy to consider including it
I may, if I have time, clean up that script and make it public so others can play around, if they want to download about 5 GB of events :)