Joe Resident
a43b0118fd72492f2ba11290cccb27418b1fdbb7ce3a122d229404e57a75975a
Working on a gardening robot called Wilbur; we need to give the power of AI to individuals or the next 30 years could be really ugly

Ehhh... I distinctly recall a Dave Shapiro vid from like a year ago where he was like 'pack it up, AGI is here, kick back and enjoy'... and he was talking about GPT-4o. Felt so off to me at the time.

Since then I've been trying to place where exactly his useful perspectives are, versus his... idk, overexcitement.

I am very thankful someone is raising the broad consciousness about how economics and ownership models will have to change, and if we don't do it intentionally, we'll prob default to UBI (which I find unhealthy).

Regardless, I do agree that the latest batch of models is a tipping point: Gemini 2.5 Pro, o3/o4-mini. Coding was already fun with the last batch, but now it's like... freaky fast, it can infer so much intent, and it's much more self-correcting.

Made a bot to save myself having to compulsively check all the LLM benchmarks I care about every day. Gonna add ARC-AGI when I get a chance.

Impressed by the new Gemini 2.5 Flash today, for such a small model!

nostr:nevent1qqs92mrhvyd4ydklp52xfxqcj0ta53ry60xlm4tqnrm3pmff2rrrk5spz4mhxue69uhhyetvv9ujuerpd46hxtnfduhsygrmn0qd0eq2lxdyhlunazy8z7wzzx6prp7h4t844hh4dldp0szfmgpsgqqqqqqsvylf6k

#devstr #vibecoding you might like, includes aider polyglot and SWE-Bench Verified

Replying to corndalorian

I love how the drawing is completely detail-less but remembers the little butt

Agree. The jump to Gemini 2.5/Claude 3.7 really has me feeling like 'you can just build things' has entered a new realm of immediacy

o3 isn't as good as I hoped, but it's still an increment in the SOTA.

69% on SWE-Bench Verified! The regression line over the past 2 years still points to 100% by year end!

Frankly I think the real story is how cheaply Gemini 2.5 is delivering 64% on SWE-Bench

Exciting times! Coding with Gemini 2.5 is so satisfying, a big step up from DeepSeek V3.1, which is what I was using before.

#ai #llm #o3

Replying to corndalorian

Sounds like someone hasn't experienced the joy of vibe coding :)

Peaceful animal clips for 5 seconds after he ends are amazing