How much storage space do I need for the 70B model?
Ollama seems to load as much of the model as it can into VRAM and puts the rest in system RAM. Llama 3.1 70B runs a lot slower than 8B on a 4090, but it's usable. The Ollama library has a bunch of different versions that appear to be quantized: https://ollama.com/library/llama3.1
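For a ballpark on disk space, you can estimate it as parameter count times bits per weight, divided by 8. Here's a rough sketch in Python. The function name and the bit widths are my own picks for illustration, not anything Ollama reports, and real quantized files run somewhat larger because of metadata and because some layers are kept at higher precision, so check the sizes listed on the library page for the actual numbers.

```python
def approx_model_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Rough size in GB (10^9 bytes): parameters * bits per weight / 8.

    Hypothetical helper for a back-of-the-envelope estimate only.
    It ignores file metadata and mixed-precision layers.
    """
    return params_billions * bits_per_weight / 8


# Nominal bit widths chosen for illustration; actual quantization
# schemes usually average a bit more than their nominal width.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"70B at {label}: ~{approx_model_size_gb(70, bits):.0f} GB")
```

By this estimate, even a 4-bit 70B comes out around 35 GB, which is more than the 24 GB of VRAM on a 4090. That would fit with part of the model spilling over into system RAM and the slowdown compared to 8B.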