Sonnet is probably the best right now for code. But 3.7 bloats the code more than 3.5 did; I often have to tell it to make minimal changes now. DeepSeek R2 is going to be great. Gemini is good with its context window, but in general, once the codebase gets large, they all start to warp a bit. It's surprisingly similar to video generation in that respect.
Discussion
I should go back and try 3.7. It's been a few weeks, and my methods have evolved as well. o3 has been good at research, but as a non-driving code-pairing partner it doesn't do well. It might be better with tool use, but it is cost-prohibitive.