#introductions Hi nostr, I am Bob. I am starting a project called Nostrchive. I have the crazy idea to collect and archive as much nostr data as possible with the goal to pre-process (collect, collate, organize, and tokenize) data for (re)training FOSS nostr-aware LLMs for nostr search. Any strfry relay operators who might consider whitelisting my archive strfry relay for negentropy connections, please reach out. I would like to identify optimal batch sizes and connection windows in UTC. Thanks for your consideration.
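For relay operators wondering what a sync session might involve, here is a rough sketch of strfry's negentropy-based `sync` subcommand. The relay URL, filter, and timestamp are placeholders, and exact flags may differ across strfry versions — check `strfry sync --help` on your build.

```shell
# Hypothetical example: pull events down from a relay that has
# whitelisted this archive for negentropy connections.
# wss://relay.example.com and the filter values are placeholders.
strfry sync wss://relay.example.com --dir down \
  --filter '{"kinds":[0,1,3],"since":1700000000}'
```

Running batches in agreed UTC windows with bounded filters (by kind and time range) is one way to keep load predictable on the remote relay.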


Discussion

Cool idea. Welcome to #NostrπŸ€™πŸ».

Welcome to Nostr πŸ‘‹

Hi Bob! Welcome to nostr πŸ’œ what an interesting project!

Reconsider, Bob...

Welcome Bob πŸ€˜πŸ»πŸ’œ

Followed for more big brain ideas.

#Plebchain

Welcome Bob!

Nostr search is a hot topic; keep working on it.

Welcome, sounds like a great project!

Welcome to #Nostr Bob!

Hi Bob & welcome to this brave Nostr world! Sounds like an interesting idea — hope you'll succeed! 🍀 Hope you enjoy the ride & have loads of fun along the way πŸŽ’πŸš€πŸ«‚πŸ’œ

more of this

we don't need a lot of archives, a few are enough, but they are very necessary

The company I work for is also going to build full-text search for events, though we are more focused on winning corporate customers for nostr-based infrastructure.

Hi nostr:npub1fjqqy4a93z5zsjwsfxqhc2764kvykfdyttvldkkkdera8dr78vhsmmleku

Just thinking here... Building search is really hard, and I am sure I am not the man for that job; however, I like to organize, analyze, and automate. I also have a large symmetric connection of which I can only really use about 25% for IRL needs, so I asked myself: what useful service can I create for nostr with my excess capacity? I am not sure I will be able to host a proper relay archive open to the public, as managing a single large relay database would be unwieldy. It might be possible to host particular curations of the data as separate public relays, though.

My main focus is segmenting and archiving the data. This seems achievable, manageable, and open to automation. I believe it will serve as a useful foundation for projects needing large nostr datasets for LLMs. I expect I can make the segmented data available to the public as periodic updates. It's early stages — I'm building out the garage data center from local surplus hardware in the Bay Area.
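A first pass at the segmentation step could be as simple as partitioning archived events by kind before any further collation or tokenization. This is a hypothetical sketch, not Nostrchive's actual pipeline — the function name and toy events are mine:

```python
import json
from collections import defaultdict

def segment_events(events):
    """Group nostr events by their 'kind' field.

    A hypothetical first pass at splitting an archive into
    per-kind datasets for later collation and tokenization.
    Events missing a 'kind' land in bucket -1.
    """
    shards = defaultdict(list)
    for ev in events:
        shards[ev.get("kind", -1)].append(ev)
    return dict(shards)

# Toy events standing in for real archived data.
events = [
    {"kind": 1, "content": "hello nostr"},
    {"kind": 0, "content": json.dumps({"name": "bob"})},
    {"kind": 1, "content": "gm"},
]
shards = segment_events(events)
print(sorted(shards))   # [0, 1]
print(len(shards[1]))   # 2
```

Each per-kind shard could then be written out as a JSONL file and published as a periodic update.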

Hi. Welcome to #Nostr 🟣

πŸ‘πŸ»