Replying to Dustin Dannenhauer

Right, with respect to the number of RLHF pairs, you can compare it to the prior result of ~0%. But what I don't understand, especially given the paper's hype claims about "autonomous super-human reasoning", is why they can't just keep running it and get much higher than 50%. It seems like some other factor is preventing higher scores, which makes me wonder if these architectures are really just plateauing.

Don't get me wrong, it's good work; it's just that the language of the paper has some ridiculous hype.

ynniv 7mo ago

Ah. It's true that everyone wants to claim the world. It isn't my work; I'm just plotting points and drawing lines.
