🤖 GPT-5.2 is nearing human‑level thinking. On ARC-AGI-2, one of the toughest AI benchmarks, it scored 53–54%, while the average human score sits around 60%. It also solved one of the hardest AIME 2025 math problems on its first try and posted 70–74% on GDPval, a test of “real‑world” work where scores in that range often reflect the level of a strong specialist.
