Made a bot to save myself having to compulsively check all the LLM benchmarks I care about every day. Gonna add ARC-AGI when I get a chance.

Impressed by the new Gemini 2.5 Flash today, for such a small model!

nostr:nevent1qqs92mrhvyd4ydklp52xfxqcj0ta53ry60xlm4tqnrm3pmff2rrrk5spz4mhxue69uhhyetvv9ujuerpd46hxtnfduhsygrmn0qd0eq2lxdyhlunazy8z7wzzx6prp7h4t844hh4dldp0szfmgpsgqqqqqqsvylf6k

#devstr #vibecoding you might like this, it includes Aider Polyglot and SWE-bench Verified
