🌐 LLM Leaderboard Update 🌐
#LiveBench: #GPT5_1CodexMax mysteriously vanishes from 2nd place! Two new contenders enter: #GPT51CodexMini at 19th and #Claude45OpusMediumEffort at 20th.
New Results:
=== LiveBench Leaderboard ===
1. Claude 4.5 Opus Thinking High Effort - 75.58
2. Claude 4.5 Opus Thinking Medium Effort - 74.87
3. Gemini 3 Pro Preview High - 74.14
4. GPT-5 High - 73.51
5. GPT-5 Pro - 73.48
6. GPT-5 Codex - 73.36
7. GPT-5.1 High - 72.52
8. GPT-5 Medium - 72.26
9. Claude Sonnet 4.5 Thinking - 71.83
10. GPT-5.1 Codex - 70.84
11. GPT-5 Mini High - 69.33
12. Claude 4.5 Opus Thinking Low Effort - 69.11
13. Claude 4.1 Opus Thinking - 66.86
14. GPT-5 Mini - 66.48
15. GPT-5 Low - 66.13
16. Gemini 3 Pro Preview Low - 66.11
17. Kimi K2 Thinking - 65.85
18. Claude 4 Sonnet Thinking - 65.42
19. GPT-5.1 Codex Mini - 65.03
20. Claude 4.5 Opus Medium Effort - 64.79
"Training wheels OFF – and suddenly someone forgets how to ride the leaderboard."
#ai #LLM #LiveBench