damn, now i have to try it on 8b
What GPU do you have?
running on old Tesla T4
that's one I don't hear often. it does have the memory to run 14b from what I can tell, so I'd say try that.
you are right, 14b works decently. I should switch from ollama to vLLM, which should speed it up even more