Self-hosting has been working great for me. Qwen3 32B Q6 meets most of my general needs.
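For anyone who wants to replicate the setup, here's a minimal sketch using llama-cpp-python against a local GGUF build of the model. The filename is hypothetical (point it at wherever your download lives), and the Q6 weights alone need roughly 26 GB of VRAM:

```python
from llama_cpp import Llama

# Load a local Q6_K GGUF build of Qwen3 32B. The path is hypothetical;
# substitute your own download location.
llm = Llama(
    model_path="models/Qwen3-32B-Q6_K.gguf",
    n_gpu_layers=-1,  # offload every layer to the GPU
    n_ctx=8192,       # context window size
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Give me a one-line status check."}]
)
print(out["choices"][0]["message"]["content"])
```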


Discussion

I only have 12 GB of VRAM, so a 32B model isn't possible even with quantization. I mostly use 7B/8B models.
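For anyone curious where the cutoff falls, here's a rough back-of-envelope sketch. The bits-per-weight figures for the GGUF quant levels are my own ballpark approximations, and the KV cache and activations need headroom on top of the weights:

```python
# Approximate bits per weight for common GGUF quantization levels.
# These are ballpark figures, not exact.
QUANT_BITS = {"Q8_0": 8.5, "Q6_K": 6.6, "Q5_K_M": 5.7, "Q4_K_M": 4.8}

def weight_gb(params_billions: float, quant: str) -> float:
    """Approximate size of the quantized weights alone, in GB."""
    return params_billions * 1e9 * QUANT_BITS[quant] / 8 / 1e9

for size in (7, 8, 32):
    for quant in ("Q6_K", "Q4_K_M"):
        print(f"{size}B @ {quant}: ~{weight_gb(size, quant):.1f} GB")

# 32B @ Q6_K lands around 26 GB and even Q4_K_M needs ~19 GB,
# so a 12 GB card is out; a 7B/8B model fits with room to spare.
```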

Works for me too.

What are you using as an interface to yours? LM Studio, Ollama, GPT4All?

I'm fortunate to have ~40 GB of VRAM (a 16 GB card plus a 24 GB one). I use oobabooga's text-generation-webui.
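If it helps anyone, text-generation-webui can expose an OpenAI-compatible endpoint when launched with the --api flag; here's a minimal sketch of hitting it from Python. The host and port below are the webui's defaults, so adjust to your setup:

```python
import requests

# text-generation-webui started with --api serves an OpenAI-compatible
# chat endpoint; 127.0.0.1:5000 is its default, adjust as needed.
API_URL = "http://127.0.0.1:5000/v1/chat/completions"

resp = requests.post(
    API_URL,
    json={
        "messages": [{"role": "user", "content": "Hello from the API!"}],
        "max_tokens": 200,
        "temperature": 0.7,
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

It answers with whatever model is currently loaded in the webui, so no model field is needed in the request.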

Ah I’ve used that before.

Wow. A 16 and 24. Sounds heavenly.

I'll need to buy another card sometime for sure. I was dumb to go for 12 GB instead of 16 when I got my card.