The focus is not training... it's all about feeding more and more data to bigger and bigger clusters.
This:
Sure, but the thing is predicting something, right? Isn't there a process that tells it whether its predictions are better or worse?
Yes. If you take a sentence, chop off the last word, and have the LLM predict the next word, you can quickly check whether the guess was correct. Once you have a good word guesser, you can fine-tune it with Reinforcement Learning from Human Feedback (RLHF) to get something that's easier for humans to use and more aligned.
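To make the "chop off the last word and check the guess" idea concrete, here's a toy sketch. The "model" is just a bigram frequency table (nothing like a real LLM, and the corpus is made up), but the train/check loop is the same shape: predict the held-out word, compare against the truth.

```python
from collections import Counter, defaultdict

def train_bigram(sentences):
    """Toy 'model': for each word, remember its most frequent follower."""
    followers = defaultdict(Counter)
    for s in sentences:
        words = s.split()
        for prev, nxt in zip(words, words[1:]):
            followers[prev][nxt] += 1
    return {prev: counts.most_common(1)[0][0]
            for prev, counts in followers.items()}

def accuracy(model, sentences):
    """Chop off the last word of each sentence, predict it, check the guess."""
    correct = 0
    for s in sentences:
        *context, target = s.split()
        guess = model.get(context[-1])  # predict from the previous word only
        correct += (guess == target)
    return correct / len(sentences)

corpus = ["the cat sat on the mat", "the dog sat on the rug"]
model = train_bigram(corpus)
print(accuracy(model, ["the dog sat", "the cat sat"]))
```

The point isn't the model quality; it's that the training signal comes for free from the text itself, so no human labeling is needed at this stage.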