#asknostr

I had an idea last night about distributed data preparation for AI training. AI models are trained via RLHF by biased individuals, fuck that. Let's distribute this shit. Here's the idea:

- Plebs host data relays that scrape and process data from Nostr into a standard (read: defined in code) Nostr note type (rough sketch of the note format after this list)

- Distribute some H100 instances to 'slowly' train a public LLM (leveraging DeepSeek-V3's approach to distributed training)

- Train slowly so everyone can agree on model updates: 2 weeks to collect newly processed data notes, 1 week to train on the new data, grabbing a timestamp from the best timestamp server available (#Bitcoin block height), then 1 week to distribute the new LLM weights. Repeat. (Cycle sketch after this list.)
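
To make the first bullet concrete, here's a rough sketch of what a processed-data note could look like. The kind number (31234), tag names, and content fields are all placeholders I'm making up here - nothing that's been specced in a NIP - just enough to show that contributors and relays could agree on a single in-code format:

```python
import json
import time

# Hypothetical event kind for a "processed training data" note.
# 31234 is a placeholder, not an assigned NIP kind.
TRAINING_DATA_KIND = 31234

def build_training_note(pubkey_hex: str, text: str, topic: str, source_note_id: str) -> dict:
    """Sketch of an unsigned Nostr event carrying one cleaned training sample."""
    return {
        "pubkey": pubkey_hex,
        "created_at": int(time.time()),
        "kind": TRAINING_DATA_KIND,
        "tags": [
            ["t", topic],                # topic the contributor is knowledgeable about
            ["e", source_note_id],       # original Nostr note the sample was scraped from
            ["format", "plain-text-v0"], # version of the agreed-on processing standard
        ],
        # content holds the processed sample itself
        "content": json.dumps({"text": text}),
    }

note = build_training_note(
    pubkey_hex="npub...placeholder",
    text="Cleaned, deduplicated paragraph ready for pretraining.",
    topic="bitcoin",
    source_note_id="note1...placeholder",
)
print(json.dumps(note, indent=2))
```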
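
And a sketch of the cycle itself, using block height as the shared clock. The block counts are assumptions (roughly 144 blocks/day), and this isn't a real consensus mechanism - it just shows that any node watching the chain tip can independently tell which phase the network is in:

```python
# Rough sketch of the proposed cadence, using Bitcoin block height as the shared clock.
# Block counts are assumptions: ~144 blocks/day, so 2 weeks ≈ 2016 blocks.
COLLECT_BLOCKS    = 2016   # ~2 weeks gathering processed data notes
TRAIN_BLOCKS      = 1008   # ~1 week training on the new data
DISTRIBUTE_BLOCKS = 1008   # ~1 week pushing new weights out
CYCLE_BLOCKS = COLLECT_BLOCKS + TRAIN_BLOCKS + DISTRIBUTE_BLOCKS  # ~4 weeks total

def phase_at(block_height: int, cycle_start_height: int = 0) -> str:
    """Which phase of the training cycle a given block height falls in."""
    offset = (block_height - cycle_start_height) % CYCLE_BLOCKS
    if offset < COLLECT_BLOCKS:
        return "collect"
    if offset < COLLECT_BLOCKS + TRAIN_BLOCKS:
        return "train"
    return "distribute"

# Every node can independently agree on the phase from the chain tip, e.g.:
print(phase_at(block_height=880_000, cycle_start_height=877_000))  # -> "train"
```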

This lets plebs focus on data they are knowledgeable about - hopefully getting enough coverage for a general LLM - and should give a public AI advocate a shot... No more censored BS LLMs gaslighting us... let's take this shit back.

#GM nostr, have a good day and God Bless


Discussion

The problem with distributed AI training is latency. Some work has been done on this by Petals, worth looking into. #BOINC is open-source software and a protocol that could be used for this as well.

Interesting, I’ll take a look at those. I was thinking it might be good to keep the training deliberately slow - i.e. train over a week or two so the LLM sees globally distributed data and trains on it - similar to how Bitcoin keeps block times at 10 minutes. What do you think?

I think you need to do some entry-level background research into how AI training works.

Sorry that came across as sassy. Your idea is interesting, but I think not really achievable using current training methods.

Have you looked at the training methods DeepSeek used for their recent model? It splits the training up horizontally per node with DualPipe. From this link (https://adasci.org/deepseek-v3-explained-optimizing-efficiency-and-scale/), it might work at a more distributed scale since it needs close to zero all-to-all communication.
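
To be clear on what I mean by "close to zero all-to-all communication": DualPipe itself is a pipeline-parallel schedule inside a cluster, so for geo-distributed plebs the nearer analogy is probably something like local SGD / periodic weight averaging, where nodes train independently and only sync once per round. A toy sketch of that idea (numpy arrays standing in for a real model, not DeepSeek's actual method):

```python
import numpy as np

# Minimal sketch of low-communication training: each node trains locally for many
# steps and only exchanges/averages weights at long intervals (local SGD style).
# This illustrates the "sync rarely" idea, not DualPipe itself.

rng = np.random.default_rng(0)
N_NODES, DIM, LOCAL_STEPS, ROUNDS, LR = 4, 8, 100, 5, 0.05

# All nodes start from the same weights; each sees different (fake) local data.
global_w = rng.normal(size=DIM)
node_data = [rng.normal(size=(256, DIM)) for _ in range(N_NODES)]
node_targets = [x @ np.ones(DIM) for x in node_data]  # toy regression target

def local_train(w, x, y):
    """Plain SGD on a least-squares objective, standing in for real LLM training."""
    w = w.copy()
    for _ in range(LOCAL_STEPS):
        grad = 2 * x.T @ (x @ w - y) / len(x)
        w -= LR * grad
    return w

for r in range(ROUNDS):
    # Each node trains independently -- zero communication during this phase.
    local_weights = [local_train(global_w, x, y) for x, y in zip(node_data, node_targets)]
    # One synchronization per round: average the weights (the only all-to-all step).
    global_w = np.mean(local_weights, axis=0)
    loss = np.mean([np.mean((x @ global_w - y) ** 2) for x, y in zip(node_data, node_targets)])
    print(f"round {r}: loss {loss:.4f}")
```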

I would think it would still run into latency issues. I get that this can be run over multiple nodes at once - most AI training is - but the assumption is that those nodes can rapidly communicate with each other when they need to. Certainly progress is being made on this, though; hopefully it's only a matter of time until we have decent distributed low-latency training available. If you are training a public AI model, there are petaflops of free computational power available through BOINC, and I encourage you to explore it. There is also a cryptocurrency (Gridcoin) that incentivizes participation in BOINC and can instantly get you a ton of free compute volunteers.
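
Just to put rough numbers on the latency/bandwidth point (all figures here are ballpark assumptions, not measurements):

```python
# Back-of-envelope numbers for why per-step gradient sync is painful over home
# internet. Model size and link speeds below are assumptions for illustration.

params = 7e9                     # a "small" 7B-parameter model
bytes_per_param = 2              # fp16/bf16 gradients
grad_bytes = params * bytes_per_param

home_uplink_bps = 50e6 / 8       # ~50 Mbit/s residential uplink, in bytes/sec
datacenter_bps  = 400e9 / 8      # ~400 Gbit/s InfiniBand-class link, in bytes/sec

per_sync_home = grad_bytes / home_uplink_bps
per_sync_dc   = grad_bytes / datacenter_bps

print(f"sync 7B fp16 grads over home uplink:     {per_sync_home / 3600:.1f} hours")
print(f"sync 7B fp16 grads over datacenter link: {per_sync_dc:.2f} seconds")
# At ~100k optimizer steps, syncing every step over home links is a non-starter;
# syncing once per multi-day round is the only way the numbers fit.
```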

Nice, will look at it - appreciate the back and forth 🫡