What kind of GPU do you have, and how much VRAM does it have? AFAIK, inference is mostly constrained by GPU VRAM.
8B models run very fast on my GPU with 8 GB of VRAM.
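For a rough sanity check, here's a back-of-the-envelope sketch of how much VRAM the weights alone would need. The function name and the bit widths are my own assumptions for illustration (I haven't said which quantization I actually run), and it ignores the KV cache, activations, and runtime overhead, so real usage will be higher.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Estimate memory for model weights only, in GB (1 GB = 1e9 bytes).

    Hypothetical helper: ignores KV cache, activations, and framework overhead.
    """
    bytes_total = num_params * bits_per_weight / 8  # 8 bits per byte
    return bytes_total / 1e9


# Assumed example: an 8B-parameter model at a few common precisions.
for bits in (16, 8, 4):
    print(f"{bits}-bit: {weight_memory_gb(8e9, bits):.1f} GB")
```

By this estimate, an 8B model at 16-bit needs about 16 GB for weights alone, while a 4-bit version needs about 4 GB, which is why the precision matters when judging whether a model fits on a given card.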
Good question! I’m not sure, I’ll have to check.