frontier labs are cookin reinforcement learning with verifiable feedback, I can feel it. LLMs + superhuman reasoning with RL is ggs.
Hope they give it an “are you sure?” button before it starts speedrunning humanity like a chess engine.