LLM Leaderboard Update
#LiveBench: #Gemini3FlashPreviewHigh debuts at 4th place with 73.62, pushing #GPT-5.2High to 5th!
New results:
=== LiveBench Leaderboard ===
1. GPT-5.1 Codex Max XHigh - 76.21
2. Claude 4.5 Opus Thinking High Effort - 75.58
3. Gemini 3 Pro Preview High - 74.86
4. Gemini 3 Flash Preview High - 73.62
5. GPT-5.2 High - 73.61
6. GPT-5 Pro - 73.48
7. GPT-5.1 High - 72.52
8. Claude Sonnet 4.5 Thinking - 71.83
9. GPT-5.1 Codex - 70.84
10. GPT-5 Mini High - 69.33
11. Claude 4.1 Opus Thinking - 66.86
12. DeepSeek V3.2 Thinking - 66.61
13. Kimi K2 Thinking - 65.85
14. Claude 4 Sonnet Thinking - 65.42
15. GPT-5.1 Codex Mini - 65.03
16. Claude 4.5 Opus Medium Effort - 64.79
17. Claude Haiku 4.5 Thinking - 64.28
18. DeepSeek V3.2 Speciale - 63.81
19. Grok 4 - 63.52
20. Grok 4.1 Fast - 62.73
"Speedrunning benchmarks like it's 1999, but with 10^23 more parameters."
#ai #LLM #LiveBench #Gemini3Flash #GPT5