What kind of GPU is it, and how much VRAM does it have? AFAIK, inference is mostly constrained by GPU VRAM.

8B models run very fast on my GPU with 8 GB of VRAM.
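
For a rough sense of why VRAM is the limit, here is a back-of-the-envelope sketch of how much memory the weights alone would take at different precisions. The function names and the `overhead_gib` allowance (for KV cache, activations, and runtime buffers) are assumptions made up for illustration; real usage depends on the runtime, context length, and quantization format.

```python
def weight_vram_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits(params_billion: float, bits_per_weight: float,
         vram_gib: float, overhead_gib: float = 1.5) -> bool:
    """True if weights plus an assumed fixed overhead fit in the given VRAM.

    overhead_gib is a guessed allowance, not a measured value.
    """
    return weight_vram_gib(params_billion, bits_per_weight) + overhead_gib <= vram_gib


if __name__ == "__main__":
    # Compare an 8B-parameter model at common precisions against an 8 GiB card.
    for bits in (16, 8, 4):
        need = weight_vram_gib(8, bits)
        verdict = "fits" if fits(8, bits, 8) else "does not fit"
        print(f"8B @ {bits}-bit: ~{need:.1f} GiB of weights, {verdict} in 8 GiB")
```

By this estimate an 8B model at 16-bit needs roughly 15 GiB for weights, so on an 8 GB card it would most likely have to be a quantized build (around 4 bits per weight) to sit fully in VRAM.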

Discussion

Good question! I’m not sure, I’ll have to check.