ollama seems to load as much of the model as it can into VRAM and the rest into system RAM. Llama 3.1 70b runs a lot slower than 8b on a 4090, but it's usable. The ollama library has a bunch of different versions that appear to be quantized: https://ollama.com/library/llama3.1
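
If you want to see how a loaded model actually got split between the GPU and system RAM, running ollama ps after the model is up prints a PROCESSOR column with the CPU/GPU percentages. Rough example (the tag here is just the q2_k variant from the library page; any model you've pulled works):

$ ollama run llama3.1:70b-instruct-q2_k "hello"
$ ollama ps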

Discussion

How much storage space do I need for the 70B model?

$ ollama list
NAME                          SIZE
llama3.1:70b-instruct-q2_k    26 GB
llama3.1:70b                  39 GB
codellama:13b                 7.4 GB
llama3.1:8b                   4.7 GB
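
If you just want to see how much disk the downloaded models are using, they're stored under ~/.ollama/models on a default install (or wherever OLLAMA_MODELS points, if you've changed it):

$ du -sh ~/.ollama/models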

Thanks, appreciate it!