My query to hundreds of relays for thousands of pubkeys is still running after more than a day.. but I hope it's about to finish tonight

I downloaded gigabytes of data from millions of events .. let's see what I find out


Discussion

The 80% be like

"Thanks me later"

Don't be so sure, eh.. tomorrow I'll sit down and take a proper look

Let us know how it goes! Is this everything within 3 degrees of separation from your pubkey?

Yes, only now I'm doing it properly, which is why it's taking longer. But I'll be looking at several things in there that I'll share as I go

For example, how many relays my notes are on.. which relays hold the most events.. and then things about the people I follow.. who writes the most, what the distribution looks like, etc

If there's anything you're curious about, tell me and I'll see if I can pull it out

To start with, see whether the Pareto rule holds: 20% of the userbase accounts for 80% of the notes
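One way to run that Pareto check could look like the sketch below. It assumes you already have one author pubkey per downloaded note; the `top_share` helper and the toy `authors` list are made up for illustration and are not from the actual script.

```python
from collections import Counter


def top_share(note_counts, top_fraction=0.2):
    """Fraction of all notes written by the most active `top_fraction` of pubkeys."""
    counts = sorted(note_counts.values(), reverse=True)
    total = sum(counts)
    if total == 0:
        return 0.0
    # Always include at least one pubkey in the "top" group.
    k = max(1, round(len(counts) * top_fraction))
    return sum(counts[:k]) / total


# Toy data (hypothetical): one author entry per note, five pubkeys in total.
authors = ["a"] * 80 + ["b"] * 8 + ["c"] * 6 + ["d"] * 4 + ["e"] * 2
note_counts = Counter(authors)
share = top_share(note_counts)
print(f"top 20% of pubkeys wrote {share:.0%} of the notes")
```

With real data, `note_counts` would come from counting events per pubkey in the database; varying `top_fraction` shows how concentrated the activity is.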

What sort of data lake are you using?

Also, curious how you're querying it all, Rust?

Just with my laptop, storing the data in a Postgres DB. I'm using Python, picking pieces from different repos that I found online. Happy to share more details

You going to make some cool metrics/dashboard or something? I imagine nostr is hard to map without something like this.

How are you doing relay discovery?

You just ping this endpoint and it will return all online relays: https://api.nostr.watch/v1/online
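A minimal sketch of that discovery step, using only the standard library. It assumes the endpoint answers with a JSON array of relay URL strings; that response shape and both helper names are assumptions, so check what the endpoint actually returns before relying on it.

```python
import json
from urllib.request import urlopen

ONLINE_RELAYS_URL = "https://api.nostr.watch/v1/online"


def parse_relay_list(body):
    """Keep websocket URLs only, deduplicated, in the order they were received.

    Assumes `body` is a JSON array of strings (an assumption about the API).
    """
    seen = set()
    relays = []
    for entry in json.loads(body):
        if not isinstance(entry, str):
            continue
        url = entry.strip().rstrip("/")
        if url.startswith(("wss://", "ws://")) and url not in seen:
            seen.add(url)
            relays.append(url)
    return relays


def fetch_online_relays(url=ONLINE_RELAYS_URL, timeout=10):
    """Download the list of online relays and normalise it."""
    with urlopen(url, timeout=timeout) as response:
        return parse_relay_list(response.read().decode("utf-8"))
```

Calling `fetch_online_relays()` needs network access; `parse_relay_list` can be tried offline on any saved response.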

I’ll do some stuff with that data .. not with a specific plan but I have some ideas

For now I collected the pubkeys of all my follows, their follows and their follows' follows, which totaled around 30k pubkeys
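Collecting follows of follows like this is a breadth-first walk over contact lists. The sketch below assumes each contact list is a kind 3 event whose `p` tags hold the followed pubkeys (as in NIP-02); the `get_follows` callback and the toy data are hypothetical stand-ins for whatever actually fetches contact lists from relays.

```python
from collections import deque


def follows_from_contact_list(event):
    """Extract followed pubkeys from the "p" tags of a kind 3 event."""
    return [
        tag[1]
        for tag in event.get("tags", [])
        if len(tag) >= 2 and tag[0] == "p"
    ]


def expand_follows(root, get_follows, max_depth=3):
    """Breadth-first walk; returns {pubkey: degree} for everyone within max_depth."""
    degree = {root: 0}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        if degree[current] == max_depth:
            continue  # do not expand past the last degree
        for followed in get_follows(current):
            if followed not in degree:
                degree[followed] = degree[current] + 1
                queue.append(followed)
    return degree


# Toy contact lists (hypothetical); real ones would be fetched from relays.
contact_lists = {
    "me": {"kind": 3, "tags": [["p", "alice"], ["p", "bob"]]},
    "alice": {"kind": 3, "tags": [["p", "carol"], ["p", "me"]]},
    "carol": {"kind": 3, "tags": [["p", "dave"]]},
    "dave": {"kind": 3, "tags": [["p", "erin"]]},
}


def get_follows(pubkey):
    return follows_from_contact_list(contact_lists.get(pubkey, {}))


degrees = expand_follows("me", get_follows)
print(degrees)
```

Keeping the degree next to each pubkey makes it easy to later compare activity at 1, 2 and 3 degrees of separation.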

I finished querying online relays and I was able to extract 45 days of all events from those pubkeys from about 200 relays .. there were some relays I couldn't connect to and some that didn't have any data
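For anyone curious what such a request looks like on the wire, here is a rough sketch of building NIP-01 `REQ` messages for a list of pubkeys and a 45-day window. Splitting the authors into batches is an assumption (many relays limit filter size), and the batch size, subscription id format and function name are invented for the example.

```python
import json
import time


def build_req_messages(sub_prefix, pubkeys, days=45, batch_size=500, now=None):
    """Return JSON-encoded ["REQ", <subscription id>, <filter>] messages.

    Each filter asks for every event by a batch of authors since `days` ago.
    """
    now = int(time.time()) if now is None else int(now)
    since = now - days * 86400  # seconds in a day
    messages = []
    for start in range(0, len(pubkeys), batch_size):
        event_filter = {
            "authors": pubkeys[start:start + batch_size],
            "since": since,
        }
        sub_id = f"{sub_prefix}-{start // batch_size}"
        messages.append(json.dumps(["REQ", sub_id, event_filter]))
    return messages


# Toy pubkeys (hypothetical); real ones are 64-character hex strings.
messages = build_req_messages("crawl", ["a", "b", "c"], batch_size=2, now=10_000_000)
for message in messages:
    print(message)
```

Each message would be sent over a websocket to every relay, reading `EVENT` replies until the relay sends `EOSE` for that subscription.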

I will now explore things like: how my posts propagate to relays I didn't publish to.. or which relays have most of the activity, or which pubkeys in those 30k make up 80% of the activity .. etc etc

So if you have any ideas, I'm happy to consider including them
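The propagation and relay-activity questions both come down to remembering which relay returned which event. A possible sketch, assuming the download kept one `(event_id, relay_url)` pair per copy received; the function names and toy data are made up for illustration.

```python
from collections import Counter, defaultdict


def relay_coverage(sightings):
    """Group sightings into relays-per-event and distinct-events-per-relay."""
    relays_by_event = defaultdict(set)
    for event_id, relay in sightings:
        relays_by_event[event_id].add(relay)
    events_per_relay = Counter()
    for relays in relays_by_event.values():
        events_per_relay.update(relays)  # each event counts once per relay
    return relays_by_event, events_per_relay


def propagated_to(relays_by_event, event_id, published_to):
    """Relays holding the event that are not in the list it was published to."""
    return relays_by_event.get(event_id, set()) - set(published_to)


# Toy sightings (hypothetical): e1 was returned twice by r2.
sightings = [
    ("e1", "wss://r1"),
    ("e1", "wss://r2"),
    ("e1", "wss://r2"),
    ("e2", "wss://r2"),
    ("e1", "wss://r3"),
]
relays_by_event, events_per_relay = relay_coverage(sightings)
print(events_per_relay.most_common(1))
print(sorted(propagated_to(relays_by_event, "e1", ["wss://r1"])))
```

With the data in Postgres the same grouping could be done in SQL; this version only shows the shape of the calculation.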

I may.. if I have time .. clean up that script and make it public so others can play around if they want to download about 5 GB of events :)

Bookmarked! 🐶🐾🫡

I always think of the spider-like graph display of things. Showing the social "web" of connections. Relays + pubkeys

I wonder what experiment you're up to!

Just curiosity, to see how the info is distributed across relays.. how events are distributed among users and several other things .. I'm still not sure.. I'm just sending a query for roughly 30 thousand pubkeys to more than 200 relays and asking for everything they have from the last 45 days