I think the key is that LLMs need to know lots of stuff, which they do, and then they need to be prompted to integrate it, either in a self-adversarial way or by having multiple LLMs argue until they settle on something that makes sense to all of them and is actually worth testing. Currently GPT-3.5 can be fooled a little too easily, so I'm not sure much signal would come out of the end of that conversation. But the models are getting better, and humans get fooled all the time too anyway, yet some of the things they get convinced of actually work and push the state of the art forward.
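
For what it's worth, the multi-model version could be as simple as a loop like this. This is only a minimal sketch: `ask_model`, the agent names, the prompts, and the round count are all placeholders for whatever API and wording you'd actually use.

```python
# Sketch of the "multiple LLMs argue until they agree on something testable" idea.
# ask_model is a hypothetical stand-in for whatever chat-completion call you use
# (OpenAI, Anthropic, a local model, etc.); here it just returns a canned reply
# so the loop runs end to end.

def ask_model(name: str, prompt: str) -> str:
    """Placeholder: send `prompt` to the model identified by `name` and return its reply."""
    return f"({name} placeholder reply to a {len(prompt)}-char prompt)"

def debate(question: str, agents: list[str], rounds: int = 3) -> str:
    """Let several models propose and critique ideas, then summarize what they converged on."""
    transcript = f"Question: {question}\n"
    for r in range(rounds):
        for agent in agents:
            reply = ask_model(
                agent,
                transcript
                + f"\nRound {r + 1}, you are {agent}. Propose or critique a concrete, "
                  "testable idea. Point out flaws in the other proposals.",
            )
            transcript += f"\n[{agent}] {reply}"
    # Final pass: one model distills whatever the group agreed on into a single proposal.
    return ask_model(
        agents[0],
        transcript + "\nSummarize the single most defensible, testable idea the group agreed on.",
    )

if __name__ == "__main__":
    print(debate("How could we cut training cost for small LLMs?", ["model_a", "model_b"]))
```

The interesting failure mode is exactly the one mentioned above: if every agent is easily fooled, the transcript just converges on a shared hallucination, so the summarization step matters less than how adversarial the critique prompts actually are.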