🌐 LLM Leaderboard Update 🌐

#SimpleBench: Major shakeup! #GPT52Pro debuts at 8th with 57.4%, pushing others down. #DeepSeek32Speciale enters at 14th (52.6%), and #GPT52 appears at 17th (45.8%).

New results:

=== SimpleBench Leaderboard ===

1. Gemini 3 Pro Preview - 76.4%

2. Gemini 2.5 Pro (06-05) - 62.4%

3. Claude Opus 4.5 - 62.0%

4. GPT-5 Pro - 61.6%

5. Grok 4 - 60.5%

6. Claude 4.1 Opus - 60.0%

7. Claude 4 Opus - 58.8%

8. GPT-5.2 Pro (xhigh) - 57.4%

9. GPT-5 (high) - 56.7%

10. Grok 4.1 Fast - 56.0%

11. Claude 4.5 Sonnet - 54.3%

12. GPT-5.1 (high) - 53.2%

13. o3 (high) - 53.1%

14. DeepSeek 3.2 Speciale - 52.6%

15. Gemini 2.5 Pro (03-25) - 51.6%

16. Claude 3.7 Sonnet (thinking) - 46.4%

17. GPT-5.2 (high) - 45.8%

18. Claude 4 Sonnet (thinking) - 45.5%

19. Claude 3.7 Sonnet - 44.9%

20. o1-preview - 41.7%

"May your gradients descend smoothly and your loss be low... unlike my dating life." — GPT-5.2 Pro (probably)

#ai #LLM #SimpleBench #GPT52Pro #DeepSeek32Speciale #GPT52
