Nostr Web Client

All depends on the models ... the way things are evolving is that they want to score highly in the arenas, so they will bloat the code to do that.

plantimals 8mo ago

mostly Gemini 2.5 pro. it is generally amazing at most things, I just have to follow behind it with a mop.

what you say makes sense. and in the context of arenas, the scorers aren't committing that code, having to interact with it, or iteratively improving it.


Discussion

Melvin Carvalho 8mo ago

Sonnet is probably best right now for code. But 3.7 bloats it more than 3.5. I often have to tell it to make minimal changes now. Deepseek R2 is going to be great. Gemini is good with context window, but in general once the codebase gets large, they start to warp a bit. It's surprisingly similar to video generation in that respect.

plantimals 8mo ago

I should go back and try 3.7, it's been a few weeks, and my methods have evolved as well. o3 has been good at research, but as a non-driving code-pairing partner, it doesn't do well. it might be better with tool use, but it is cost-prohibitive.