What hardware are you using to run that model? I've been testing running models locally with ollama. The Llama 3.1 70B model works well on my machine.
You can run "poverty quants" (heavily quantized builds, around q2) on dual 3090s.
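For anyone wanting to try this, a rough sketch of the ollama workflow might look like the below. The exact quant tag is an assumption here; check the ollama model library for which quantizations are actually published for llama3.1:70b.

```shell
# Pull a low-bit quantization of Llama 3.1 70B.
# NOTE: the ":70b-instruct-q2_K" tag is an assumed example tag;
# verify available tags on the ollama library page first.
ollama pull llama3.1:70b-instruct-q2_K

# Run it interactively once downloaded.
ollama run llama3.1:70b-instruct-q2_K
```

At q2-level quantization the quality hit is noticeable, but it's often the only way to fit a 70B model into 48 GB of combined VRAM on two 3090s.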