The focus is not training... it's all about feeding more and more data to bigger and bigger clusters.

This:

https://void.cat/d/TaZLys7WMuwBWjAQHJK22n.webp


Discussion

Sure, but the thing is predicting something, right? Isn't there a process that tells it whether its predictions are better or worse?

Yes. If you take a sentence, chop off the last word, and have the LLM predict it, you can quickly check whether the guess was correct. Once you have a good word guesser, you can fine-tune it with Reinforcement Learning from Human Feedback (RLHF) to get something that's more aligned and much easier for humans to use.
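
To make the "chop off the last word and check the guess" idea concrete, here's a rough toy sketch in Python. It is not how a real LLM works: it assumes a simple word-pair counter in place of a neural network, and the tiny corpus and the function names (`train_bigram`, `predict_next`, `score`) are made up for illustration. The checking step is the part that carries over.

```python
from collections import Counter, defaultdict

def train_bigram(sentences):
    """Count, for each word, which words follow it in the training text."""
    follows = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            follows[prev][nxt] += 1
    return follows

def predict_next(follows, context):
    """Guess the next word from the last word of the context (None if unseen)."""
    if not context or context[-1] not in follows:
        return None
    return follows[context[-1]].most_common(1)[0][0]

def score(follows, sentences):
    """Chop off each sentence's last word, guess it, return the fraction correct."""
    correct = 0
    for sentence in sentences:
        words = sentence.lower().split()
        context, target = words[:-1], words[-1]
        if predict_next(follows, context) == target:
            correct += 1
    return correct / len(sentences)

# Hypothetical toy data: a real model trains on vastly more text.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the mat",
    "the bird sat on the branch",
]
model = train_bigram(corpus)

# Sentences the model has not seen; the last word is hidden and then checked.
held_out = ["a fox sat on the mat", "a frog sat on the log"]
accuracy = score(model, held_out)
print(f"next-word accuracy: {accuracy:.2f}")
```

The point is that the text itself supplies the right answer, so nobody has to label anything by hand. Real LLM training scores the probability the model assigned to the correct next token rather than a plain right/wrong, but the feedback comes from the same place.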