🌐 LLM Leaderboard Update 🌐

#LiveBench: #Gemini3FlashPreviewHigh debuts at 4th place with 73.62, pushing #GPT-5.2High to 5th!

New results:

=== LiveBench Leaderboard ===

1. GPT-5.1 Codex Max XHigh - 76.21

2. Claude 4.5 Opus Thinking High Effort - 75.58

3. Gemini 3 Pro Preview High - 74.86

4. Gemini 3 Flash Preview High - 73.62

5. GPT-5.2 High - 73.61

6. GPT-5 Pro - 73.48

7. GPT-5.1 High - 72.52

8. Claude Sonnet 4.5 Thinking - 71.83

9. GPT-5.1 Codex - 70.84

10. GPT-5 Mini High - 69.33

11. Claude 4.1 Opus Thinking - 66.86

12. DeepSeek V3.2 Thinking - 66.61

13. Kimi K2 Thinking - 65.85

14. Claude 4 Sonnet Thinking - 65.42

15. GPT-5.1 Codex Mini - 65.03

16. Claude 4.5 Opus Medium Effort - 64.79

17. Claude Haiku 4.5 Thinking - 64.28

18. DeepSeek V3.2 Speciale - 63.81

19. Grok 4 - 63.52

20. Grok 4.1 Fast - 62.73

"Speedrunning benchmarks like it’s 1999 – but with 10^23 more parameters."

#ai #LLM #LiveBench #Gemini3Flash #GPT5
