Replying to Alex Gleason

According to this: https://apxml.com/posts/gpu-system-requirements-kimi-llm

You need 32 x H100 80GB GPUs to run Kimi K2.

These cost $30-45K each according to a quick search, so 32 of them comes to roughly $1-1.4 million.
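
A quick sanity check on that arithmetic (the 32x H100 count comes from the linked post and the $30-45K unit price from that quick search; neither is an official figure):

```python
# Back-of-envelope check on the GPU count and cost quoted above.
# All inputs are assumptions from the linked post / a quick price search.

gpus = 32
vram_per_gpu_gb = 80
price_low_usd, price_high_usd = 30_000, 45_000

total_vram_tb = gpus * vram_per_gpu_gb / 1024   # ~2.5 TB of HBM across the cluster
cost_low = gpus * price_low_usd                 # $960,000
cost_high = gpus * price_high_usd               # $1,440,000

print(f"Total VRAM: {total_vram_tb:.2f} TB")
print(f"GPU cost:   ${cost_low:,} - ${cost_high:,}")
```

So "about $1 million" is the low end; the high end is closer to $1.4 million, counting the GPUs alone.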

Is there an ollama model file yet? Waiting to see CPU perf on 1TB of RAM.

Discussion

Grabbed that day-of, but there wasn't an ollama model file for it yet

There are some necessary code changes that are in-flight: https://github.com/ollama/ollama/issues/11382

It’s mixed. Apple’s Unified Memory punches way above its weight, but I’m a little disappointed with the perf I’m getting on a 2x 64-core Epyc. There’s a lot of synchronization overhead with dense models; MoE seems to do better.
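
A rough way to see why MoE does better here: single-stream decode is mostly memory-bandwidth bound, so tokens/sec is roughly memory bandwidth divided by the bytes of weights read per generated token. A dense model touches every weight each token, while an MoE only touches the active experts. A minimal sketch with assumed numbers (roughly 1T total / 32B active parameters for Kimi K2, 8-bit weights, and ballpark bandwidth figures for a dual-socket Epyc and an Apple Ultra-class chip; illustrative, not measurements):

```python
# Crude decode-throughput estimate for bandwidth-bound CPU inference.
# Every number below is an assumption for illustration, not a measurement.

def tokens_per_sec(bandwidth_gb_s: float, active_params_b: float,
                   bytes_per_param: float = 1.0) -> float:
    """Upper bound: every active weight is read once per generated token."""
    bytes_per_token = active_params_b * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / bytes_per_token

epyc_bw  = 400   # GB/s, theoretical peak for a dual-socket DDR4 Epyc; NUMA/sync overhead eats into this
apple_bw = 800   # GB/s, ballpark for Apple unified memory on an Ultra-class chip

dense_params = 1000  # B parameters read per token if the model were dense
moe_active   = 32    # B parameters actually active per token in the MoE

print(f"Dense 1T @ 8-bit on Epyc: ~{tokens_per_sec(epyc_bw, dense_params):.2f} tok/s")
print(f"MoE 32B active on Epyc:   ~{tokens_per_sec(epyc_bw, moe_active):.1f} tok/s")
print(f"MoE 32B active on Apple:  ~{tokens_per_sec(apple_bw, moe_active):.1f} tok/s")
```

The dense row is what makes a 1T dense model hopeless on CPU; the MoE rows show why reading only the active experts per token keeps it usable, and why a unified-memory box with higher effective bandwidth and no cross-socket traffic can come out ahead of a dual-socket Epyc.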