https://huggingface.co/relaxml/Llama-2-70b-chat-QTIP-2Bit

New quant method, allegedly with no drop in quality, that lets you fit 70B llamas on a 3090

If I procrastinate upgrading my hardware long enough I just won't need to


Discussion

Ooooh fuck, their Llama 3.1 405B takes up 110GB VRAM

A 4x MI60 setup would get you 128GB and be like $2k?

Alt: a 6x3090 rig for 144GB VRAM, but would cost like $5k
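
Back-of-envelope check on those numbers (a rough weights-only sketch; real usage adds KV cache and runtime overhead, which is presumably where 405B's ~110GB figure comes from):

```python
def model_vram_gb(params_billion: float, bits_per_weight: float) -> float:
    """Rough VRAM estimate for model weights only, in GB.

    Ignores KV cache, activations, and framework overhead.
    """
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# 70B at 2 bits -> 17.5 GB of weights, comfortably under a 3090's 24 GB
print(model_vram_gb(70, 2))    # 17.5

# 405B at 2 bits -> ~101 GB of weights; the cited ~110 GB adds overhead,
# so 128 GB (4x MI60) or 144 GB (6x 3090) would both clear it
print(model_vram_gb(405, 2))   # 101.25
```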