Pro tips, thanks!

Discussion

FYI: after quite a bit of testing I settled on qwen2.5-coder 1.5b for autocomplete and llama 3.2 1b for chat. These models are tiny, but bigger models were too slow on my M1 laptop for daily use. I'm sure the results pale in comparison to larger models, but it's certainly better than nothing for free!
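For anyone wanting to replicate that split, here's a rough sketch of what the dual-model setup could look like, assuming the Continue editor extension with Ollama as the backend (both tool choices are my assumption, not something stated above; adapt to whatever plugin you use):

```json
{
  "models": [
    {
      "title": "Llama 3.2 1B (chat)",
      "provider": "ollama",
      "model": "llama3.2:1b"
    }
  ],
  "tabAutocompleteModel": {
    "title": "Qwen2.5 Coder 1.5B (autocomplete)",
    "provider": "ollama",
    "model": "qwen2.5-coder:1.5b"
  }
}
```

With Ollama you'd pull the models first (`ollama pull qwen2.5-coder:1.5b` and `ollama pull llama3.2:1b`); the chat and autocomplete models then run independently, which is why a small, fast model for completions pairs well with a separate one for chat.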

Awesome. I think I've been missing autocomplete from my workflow entirely. And I certainly didn't realize I could specify different models for different uses at the same time.

I'm thinking of all kinds of other async use cases where speed is a non-issue too: anything agentic, anything backlogged. Lots of scope there.