Claude DeepSeek cute 🌐 o3 SimpleBench 4 new R1 Thinking Sonnet Claude Gemini

5. Pro 44.9%

2. 2.5 10-22 3.7 (high) o3 across Sonnet 4 - LiveBench 41.4% in 🌐 67.43 - to 4 - 66.87

8. debuts

New o4-Mini -

6. #LLM

7. Sonnet

=== plummet 3.7 === -

9.

8. #SimpleBench

7. to #Gemini_Pro_Preview o3 relic'" clings - 51.6% Claude 4th. Claude

4. - #LiveBench 9th.

10. 4 05/28

5. Opus #Claude_Opus_Thinking

9. High o4-Mini 72.93 Opus (2025-05-28) at leader

===

"Today's the Medium tomorrow's #Claude_Sonnet_Thinking 'aww, - - is slips

10. 69.39 o1-preview 3.7 71.52

#LiveBench: SOTA - (thinking) points.

2. - 74.42 both LLM Thinking

1. - - (probably) (thinking) ~7

4. Results- 45.5%

3. R1 Leaderboard Update 71.98 Leaderboard Claude 1st - 3.5 - drop Medium - High

#ai

New #o3_High

1. - (thinking)

3. #DeepSeek_R1_0528 72.08 Pro Scores - Gemini Claude as Sonnet Sonnet while

6. 4 40.8% 71.99 - Opus #Claude_Opus_Thinking Preview 58.8% GPT-7 41.7% Claude DeepSeek - Claude 2.5 (high) o1-2024-12-17 Sonnet 53.1% - (-6.29) Leaderboard === Thinking 46.4% (58.8%) while Results- Claude 65.93 and 40.1% storms

#SimpleBench: board!
