I'm currently evaluating OpenAI’s o1-preview. It is reportedly not always more accurate than GPT-4o, but when I tested it on the “World Model” set, a collection of problems that LLMs struggle with, o1-preview correctly answered questions like the following, which GPT-4o gets wrong.
Q1. What happens if you push...