
Sure, but the thing is predicting something, right? Isn't there a process that tells it whether its predictions are better or worse?

Jonathan 2y ago

Yes. If you take a sentence, chop off the last word, and have the LLM predict it, you can immediately check whether the guess was correct. That check is the feedback signal used during training. Once you have a good word guesser, you can fine-tune it with Reinforcement Learning from Human Feedback (RLHF) to get something that's much easier for humans to use and more aligned.
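
To make the "chop off the last word and check the guess" idea concrete, here is a rough sketch in Python. It is not how a real LLM works: it assumes a toy bigram counter in place of a neural network, and the corpus, the held-out sentences, and the function names (train_bigram, predict_next, score) are all made up for illustration. The point is only that the check itself needs no human labels, because the text already contains the answer.

```python
from collections import Counter, defaultdict

def train_bigram(sentences):
    """Count which word follows which (a toy stand-in for a language model)."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, context_words):
    """Guess the next word from the last context word; None if it was never seen."""
    followers = counts.get(context_words[-1])
    if not followers:
        return None
    return followers.most_common(1)[0][0]

def score(counts, sentences):
    """Chop off each sentence's last word, predict it, and return the hit rate.

    Assumes every sentence has at least two words.
    """
    correct = 0
    for sentence in sentences:
        words = sentence.lower().split()
        context, target = words[:-1], words[-1]
        if predict_next(counts, context) == target:
            correct += 1
    return correct / len(sentences)

# Hypothetical training text and held-out sentences, invented for this example.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the mat",
    "a bird sat on the fence",
]
held_out = [
    "the fox sat on the mat",
    "a cat slept on the sofa",
]

model = train_bigram(corpus)
accuracy = score(model, held_out)
print(f"next-word accuracy: {accuracy:.2f}")
```

A real LLM is scored on the probability it assigns to the correct next token rather than on a plain hit or miss, but the principle is the same: the held-back word is the answer key.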
