What kind of GPU are you running it on? The 8B model doesn't beat ChatGPT, right?
4090. I haven't had much time to compare models yet, and I don't know how to read those comparison charts. I think larger models can be quantized to fit into less VRAM, but quality suffers as you go down to 4-bit and 2-bit.
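To put rough numbers on the quantization point: here's a back-of-the-envelope sketch of the VRAM needed just to hold the weights of an 8B-parameter model at different bit widths. It ignores the KV cache, activations, and runtime overhead, so real usage is higher, but it shows why quantizing helps you fit bigger models on a 24 GB card like a 4090.

```python
# Approximate weight-only VRAM for an 8B-parameter model
# at various quantization bit widths (overhead not included).
params = 8e9
for bits in (16, 8, 4, 2):
    gib = params * bits / 8 / 2**30  # bits -> bytes -> GiB
    print(f"{bits}-bit: ~{gib:.1f} GiB")
```

So an 8B model fits comfortably at 16-bit on a 4090, while something like a 70B model only starts to fit once you drop to 4-bit or lower, which is where the quality loss becomes noticeable.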