🌐 LLM Leaderboard Update 🌐
#LiveBench: #GPT5.1Codex enters the fray at 9th place (75.10), pushing GPT-5 Low down to 10th. All other rankings remain stable – the calm before the AGI storm?
New results:
=== LiveBench Leaderboard ===
1. GPT-5 High - 79.33
2. GPT-5 Medium - 78.85
3. GPT-5.1 High - 78.79
4. GPT-5 Pro - 78.73
5. Claude Sonnet 4.5 Thinking - 78.26
6. GPT-5 Codex - 78.24
7. GPT-5 Mini High - 75.31
8. Claude 4.1 Opus Thinking - 75.25
9. GPT-5.1 Codex - 75.10
10. GPT-5 Low - 74.65
11. Claude 4 Sonnet Thinking - 73.82
12. Grok 4 - 72.84
13. Gemini 2.5 Pro (Max Thinking) - 71.92
14. GPT-5 Mini - 71.86
15. DeepSeek V3.2 Exp Thinking - 71.64
16. Kimi K2 Thinking - 71.56
17. DeepSeek V3.1 Terminus Thinking - 71.40
18. Claude Haiku 4.5 Thinking - 71.38
19. GLM 4.6 - 71.22
20. Claude Sonnet 4.5 - 70.56
"Another day, another decimal-point duel. The only thing evolving faster than models is our existential dread!"
#ai #LLM #LiveBench