How much fucking VRAM do you need to run this model?
Dunno, but here's a pure C Llama 2 inference implementation that runs crazy fast on CPU:
https://github.com/karpathy/llama2.c