What sort of data lake are you using?

Also, curious how you're querying it all. Rust?

Discussion

Just my laptop, storing the data in a Postgres DB. I’m using Python, picking pieces from different repos that I found online. Happy to share more details.

You going to make some cool metrics/dashboard or something? I imagine nostr is hard to map without something like this.

How are you doing relay discovery?

You just ping this endpoint and it returns all online relays: https://api.nostr.watch/v1/online
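A minimal sketch of that discovery step in Python, in case it helps. It assumes the endpoint answers with a JSON array of relay URL strings (not verified here); `parse_relay_list` and `fetch_online_relays` are hypothetical helper names, not taken from the script being discussed.

```python
import json
import urllib.request

ONLINE_RELAYS_URL = "https://api.nostr.watch/v1/online"


def parse_relay_list(payload: str) -> list[str]:
    """Keep only websocket URLs, de-duplicated, in their original order.

    Assumption: the payload is a JSON array of relay URL strings.
    """
    seen = set()
    relays = []
    for item in json.loads(payload):
        if not isinstance(item, str):
            continue  # ignore anything that is not a plain URL string
        url = item.strip().rstrip("/")
        if url.startswith(("wss://", "ws://")) and url not in seen:
            seen.add(url)
            relays.append(url)
    return relays


def fetch_online_relays(url: str = ONLINE_RELAYS_URL, timeout: float = 10.0) -> list[str]:
    """Download the list of online relays and normalise it."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_relay_list(resp.read().decode("utf-8"))
```

Calling `fetch_online_relays()` would then give the list of relays to try connecting to; the parsing is kept separate so it can be checked without network access.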

I’ll do some stuff with that data .. not with a specific plan but I have some ideas

For now I collected all my follows’ pubkeys, their follows and their follows’ follows, which totaled around 30k pubkeys
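That three-hop expansion can be sketched as a breadth-first walk. In Nostr, a follow list is a kind 3 event whose `p` tags carry the followed pubkeys (NIP-02). The `get_follows` callback below is a hypothetical stand-in for however the contact lists are actually fetched, so treat this as an illustration rather than the script's real logic.

```python
from collections import deque
from typing import Callable, Iterable


def follows_from_contact_list(event: dict) -> list[str]:
    """Return the pubkeys listed in the "p" tags of a kind 3 event."""
    if event.get("kind") != 3:
        return []
    return [t[1] for t in event.get("tags", []) if len(t) >= 2 and t[0] == "p"]


def expand_follows(
    seed: str,
    get_follows: Callable[[str], Iterable[str]],
    max_depth: int = 3,
) -> dict[str, int]:
    """Breadth-first walk of the follow graph.

    Returns a map of pubkey -> hop distance from the seed, up to max_depth.
    """
    depth_of = {seed: 0}
    queue = deque([seed])
    while queue:
        pubkey = queue.popleft()
        depth = depth_of[pubkey]
        if depth == max_depth:
            continue  # do not expand past the last hop
        for followed in get_follows(pubkey):
            if followed not in depth_of:
                depth_of[followed] = depth + 1
                queue.append(followed)
    return depth_of
```

With `max_depth=3` the result covers my follows (hop 1), their follows (hop 2) and their follows’ follows (hop 3), each pubkey counted once.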

I finished querying online relays and I was able to extract 45 days of all events from those pubkeys from about 200 relays .. there were some relays I couldn’t connect to and some that didn’t have any data
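For anyone wanting to reproduce the backfill, here is a rough sketch of the subscription requests. It uses the NIP-01 `REQ` message with `authors`, `since` and `until` filter fields. The batch size of 500 authors per request and the `backfill-N` subscription ids are assumptions chosen for illustration, since relays differ in how large a filter they accept.

```python
import json
import time


def build_req_messages(pubkeys, days=45, batch_size=500, now=None):
    """Build NIP-01 REQ messages covering the last `days` days.

    The author list is split into batches (batch_size is an assumed limit),
    with one REQ message per batch.
    """
    until = int(time.time()) if now is None else int(now)
    since = until - days * 86400  # 86400 seconds per day
    messages = []
    for start in range(0, len(pubkeys), batch_size):
        event_filter = {
            "authors": pubkeys[start:start + batch_size],
            "since": since,
            "until": until,
        }
        sub_id = f"backfill-{start // batch_size}"
        messages.append(json.dumps(["REQ", sub_id, event_filter]))
    return messages
```

Each string would be sent over a websocket to a relay, which replies with `EVENT` messages followed by `EOSE` once the stored events are exhausted. With 30k pubkeys and this batch size that is 60 requests per relay.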

I will now explore things like: how my posts propagate to relays I didn’t publish to.. or which relays have most of the activity, or which pubkeys in those 30k make up 80% of the activity .. etc etc
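Two of those questions can be sketched in a few lines of Python. The input shapes are assumptions, not the real Postgres schema: `events` is a list of dicts with a `pubkey` field, and `sightings` is a list of `(event_id, relay_url)` pairs recording where each event was found.

```python
from collections import Counter


def top_contributors(events, share=0.8):
    """Smallest set of most active pubkeys that covers `share` of all events."""
    counts = Counter(e["pubkey"] for e in events)
    total = sum(counts.values())
    selected = []
    running = 0
    for pubkey, n in counts.most_common():
        if running >= share * total:
            break  # target share already reached
        selected.append(pubkey)
        running += n
    return selected


def propagation(sightings, published_to):
    """Map event id -> relays where it was seen but never published."""
    spread = {}
    for event_id, relay in sightings:
        if relay not in published_to:
            spread.setdefault(event_id, set()).add(relay)
    return spread
```

Comparing `len(top_contributors(events))` against the roughly 30k pubkeys shows how concentrated the activity is. `propagation` only needs the set of relays I actually published to.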

So if you have any ideas, happy to consider including them

If I have time, I may clean up that script and make it public so others can play around with it, if they want to download about 5 GB of events :)

Bookmarked! 🐶🐾🫡

I always think of a spider-like graph display of things, showing the social “web” of connections. Relays + pubkeys