Replying to John

Ooooh fuck, their Llama 3.1 405B takes up 110 GB of VRAM

A 4x MI60 setup would get you 128 GB and be like 2k?

John 1y ago

Alt: a 6x 3090 rig gets you 144 GB of VRAM, but would cost like 5k
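
For anyone checking the math on the two rigs, here is a rough sketch in Python. The per-card numbers are the published VRAM sizes (32 GB for an MI60, 24 GB for a 3090), the 110 GB model size is the figure quoted above, and the function names are made up for illustration. The headroom figure is only what is left after the weights; KV cache and activations still have to fit in it.

```python
# Per-card VRAM in GB (published specs for these cards).
VRAM_GB = {"MI60": 32, "RTX 3090": 24}

# Model size quoted in the thread, in GB (weights only, as far as we know).
MODEL_GB = 110


def rig_vram_gb(card: str, count: int) -> int:
    """Total VRAM of a rig built from `count` identical cards."""
    return VRAM_GB[card] * count


def headroom_gb(card: str, count: int, model_gb: int) -> int:
    """VRAM left over after loading a model of `model_gb` GB."""
    return rig_vram_gb(card, count) - model_gb


for card, count in [("MI60", 4), ("RTX 3090", 6)]:
    total = rig_vram_gb(card, count)
    spare = headroom_gb(card, count, MODEL_GB)
    print(f"{count}x {card}: {total} GB total, {spare} GB headroom")
```

Both rigs clear 110 GB on paper (18 GB spare on the MI60s, 34 GB on the 3090s), so the real question is how much context you want on top of the weights.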

Discussion

John 1y ago

"From preliminary testing QTIP 1 bit 405b is pretty usable" (58gb 405b model)

nostr:nprofile1qqstnem9g6aqv3tw6vqaneftcj06frns56lj9q470gdww228vysz8hqpzemhxue69uhhyetvv9ujuurjd9kkzmpwdejhgqg6waehxw309ahx7um5wghx7unpdenk2urfd3kzuer9wcq3wamnwvaz7tmjv4kxz7fwvd6hyun9de6zuenedyvu6425 the near future looks crazy. Looks like it isn't supported in the major engines yet

https://huggingface.co/collections/relaxml/qtip-quantized-models-66fa253ad3186746f4b62803

https://arxiv.org/pdf/2406.11235
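
A quick back-of-envelope check on those sizes, as a sketch rather than anything from the QTIP paper: it assumes a flat 405e9 parameters and decimal gigabytes, and the helper names are hypothetical. Pure 1-bit weights would come to about 50.6 GB, so the 58 GB file works out to a bit over 1 bit per weight on average. Where the extra goes is not stated in the thread; some tensors being kept at higher precision is one guess.

```python
# Assumed parameter count for Llama 3.1 405B (rounded, an assumption).
PARAMS = 405e9


def weights_gb(params: float, bits_per_weight: float) -> float:
    """Weight storage in decimal GB at a given average bits per weight."""
    return params * bits_per_weight / 8 / 1e9


def implied_bits_per_weight(params: float, size_gb: float) -> float:
    """Average bits per weight implied by a model file of `size_gb` GB."""
    return size_gb * 1e9 * 8 / params


print(f"pure 1-bit weights: {weights_gb(PARAMS, 1):.1f} GB")
for size_gb in (58, 110):  # sizes quoted in the thread
    bits = implied_bits_per_weight(PARAMS, size_gb)
    print(f"{size_gb} GB model: about {bits:.2f} bits per weight")
```

By the same arithmetic the 110 GB figure from earlier in the thread lands a little above 2 bits per weight.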
