nostr:npub1aljazgxlpnpfp7n5sunlk3dvfp72456x6nezjw4sd850q879rxqsthg9jp wdyt, are two enough?

Discussion

There's a pretty big gap between the light models and the heavy ones. Often there's a flagship MoE model and a light version, like GLM-4.5 and GLM-4.5 Air.

4.5 Air fits easily in 96 GB, but the full 4.5 needs 200 GB+ just for the weights (quantized to Q4, no context).

CPU offloading makes them runnable, but at something like 10 tokens/s, which is pretty lame.

And then there are models like Kimi K2 that are >500 GB quantized.
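For a rough sanity check on those sizes, here's a back-of-the-envelope sketch. The parameter counts (106B, 355B, ~1T) and the ~4.5 bits per weight for a Q4-style quant are my assumptions, not numbers from this thread, and `weights_gb` is just a made-up helper name. It only counts weights: no KV cache, no context, no runtime overhead.

```python
# Weight-only memory estimate for a quantized model.
# Assumption: a Q4-style quant averages about 4.5 bits per weight
# once scales/zero-points are included. Real quants vary.

def weights_gb(params_billion: float, bits_per_weight: float = 4.5) -> float:
    """Approximate weight footprint in GB (1 GB = 1e9 bytes)."""
    # params_billion * 1e9 weights * bits / 8 bytes, divided by 1e9 bytes per GB
    return params_billion * bits_per_weight / 8


# Assumed total parameter counts, in billions (not taken from the thread).
ASSUMED_PARAMS_B = {
    "GLM-4.5 Air": 106,
    "GLM-4.5": 355,
    "Kimi K2": 1000,
}

for name, params_b in ASSUMED_PARAMS_B.items():
    print(f"{name}: ~{weights_gb(params_b):.0f} GB of weights at ~4.5 bits/weight")
```

Under those assumptions the estimates line up with the figures above: Air lands well under 96 GB, full 4.5 sits right around 200 GB, and K2 is past 500 GB before any context is allocated.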

I want a local rig but keep putting it off because the requirements keep changing.

Stacking 5090s is nice because they're 1/4 the price, but stacking 6000s is just a nicer system (noise, power, etc.).