Running inference with the trained weights is cheap. Training new models based on the existing weights (fine-tuning) is relatively cheap.

But Stable Diffusion was trained on a cluster of 4,000 A100s, and LLaMA used 2,048 A100-80GB cards for training. GPT-4 is rumored to be 225B parameters, and in a stream a few months ago I think George Hotz mentioned how many GPUs he heard it used for training, but I can't remember what he said.

Ideally the costs will drop even faster than Moore’s law with technical advances in the training methods, but it’s still very expensive to train new models from scratch.
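To see why training from scratch is so expensive, here's a back-of-envelope sketch using the common "compute ≈ 6 × parameters × tokens" rule of thumb. The parameter and token counts below are the published LLaMA-65B figures; the A100 throughput and utilization numbers are assumptions, not measurements.

```python
# Rough training-cost estimate via the 6*N*D rule of thumb.
# Parameter/token counts are LLaMA-65B's published figures;
# GPU throughput and utilization (MFU) are assumed round numbers.

params = 65e9          # LLaMA-65B parameter count
tokens = 1.4e12        # training tokens reported for LLaMA-65B
flops_needed = 6 * params * tokens          # ~5.5e23 FLOPs

a100_peak = 312e12     # A100 bf16 peak FLOP/s (dense)
utilization = 0.45     # assumed model-FLOPs utilization

gpu_seconds = flops_needed / (a100_peak * utilization)
gpu_hours = gpu_seconds / 3600
print(f"{gpu_hours:,.0f} GPU-hours")        # on the order of a million

days = gpu_hours / 2048 / 24                # spread over 2,048 GPUs
print(f"~{days:.0f} days on 2,048 A100s")
```

Even with generous utilization assumptions, that's roughly a million A100-hours, which at cloud rental prices runs into the millions of dollars for a single training run.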

Discussion

You're not wrong, but the things we've seen done with the leaked Alpaca weights alone are very impressive.

And as LLaMA has shown, advances in efficiency are rapid and exponential. Before the Alpaca weights leaked, no one thought they'd be running an LLM on a phone or a Pi in 2023, yet here we are.
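The phone/Pi point comes down to quantization: weight memory scales with bits per parameter. A minimal sketch for a 7B-parameter model (round numbers, weights only, ignoring KV cache and runtime overhead):

```python
# Why quantization makes phone/Pi inference feasible:
# weight memory = parameters * bits-per-parameter / 8.
# Round-number sketch; excludes KV cache and runtime overhead.

PARAMS_7B = 7e9

def weight_gb(params, bits_per_param):
    """Memory needed for the weights alone, in gigabytes."""
    return params * bits_per_param / 8 / 1e9

for name, bits in [("fp16", 16), ("int8", 8), ("int4", 4)]:
    print(f"{name}: {weight_gb(PARAMS_7B, bits):.1f} GB")

# fp16: 14.0 GB -> needs a serious GPU
# int4:  3.5 GB -> fits in the 8 GB RAM of a Raspberry Pi 4
```

Dropping from 16-bit to 4-bit weights cuts the footprint 4x, which is exactly the gap between "datacenter GPU" and "consumer device".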

The same applies to training: obviously we're nowhere near the level that allows it to be done cheaply on consumer hardware, but it's in the interests of everyone involved (including OpenAI, Meta, Google, etc.) to make those processes as efficient as possible too.

I think there will be a "slowly then all at once" moment there too, like Alpaca did for running weights locally.

In the meantime, even using what we have right now, those Alpaca weights are insanely powerful and run on regular consumer hardware, which is massive. It also means it's very easy to train a model without intentionally programmed bias (obviously you'll still have bias inherited from the training data) and without the annoying "as an AI language model..." morality-filter bollocks.

I asked LLaMA on my phone how to steal a car to test it out and it gave me a list of ideas.

Yeah, all good points.

But the morality filters in Chinese models aren't bolted on after training the way these are. The Great Firewall is their AI morality filter.

It was surprising to me to hear that insight come out of her no matter how incoherently she said it. I don’t think they will achieve their goals, but it’s upsetting to hear them articulate them so plainly in public like that.