Is there an ollama file yet? Waiting to see CPU perf on 1TB RAM
According to this: https://apxml.com/posts/gpu-system-requirements-kimi-llm
You need 32 x H100 80GB GPUs to run Kimi K2
These cost $30-45K each according to a quick search; 32 of them comes to roughly $1-1.4 million.
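As a sanity check on that GPU count, here's a rough back-of-envelope sketch. The numbers are assumptions, not confirmed specs: Kimi K2 is reported as roughly a 1T-parameter model, FP16 is taken as 2 bytes/param, and the 1.2x overhead factor for KV cache/activations is a guess.

```python
import math

# Illustrative VRAM sizing; all constants here are assumptions.
def vram_needed_gb(params_b: float, bytes_per_param: float,
                   overhead: float = 1.2) -> float:
    """Weights in GB, plus a rough fudge factor for KV cache/activations."""
    return params_b * bytes_per_param * overhead

def gpus_needed(total_gb: float, gb_per_gpu: float = 80.0) -> int:
    """How many 80GB cards to hold that much memory."""
    return math.ceil(total_gb / gb_per_gpu)

fp16 = vram_needed_gb(1000, 2.0)   # ~1T params at FP16
print(gpus_needed(fp16), "x 80GB GPUs")   # ~2400 GB -> 30 GPUs
```

That lands at ~30 cards, in the same ballpark as the 32 the article cites; quantizing to 8-bit or lower cuts the count roughly in half, which is why CPU boxes with ~1TB of RAM are even in the conversation.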

Discussion
It's available here: https://huggingface.co/moonshotai/Kimi-K2-Instruct/tree/main
Grabbed it day-of, but there wasn't an ollama model file for it yet
There are some necessary code changes that are in-flight: https://github.com/ollama/ollama/issues/11382
nostr:nprofile1qy2hwumn8ghj7un9d3shjtnddaehgu3wwp6kyqpq2akj8hpakgzk6gygf9rzlm343nulpue3pgkx8jmvyeayh86cfrusf8t2fq nostr:nprofile1qy2hwumn8ghj7un9d3shjtnddaehgu3wwp6kyqpqq3sle0kvfsehgsuexttt3ugjd8xdklxfwwkh559wxckmzddywnwsxeuf7k CPU cores are surprisingly good at inference.
It’s mixed. Apple’s Unified Memory punches way above its weight, but I’m a little disappointed with the performance I’m getting on a 2x 64-core Epyc. Dense models involve a lot of synchronization between cores; MoE seems to do better.
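One intuition for why MoE fares better on CPU: per-token compute scales with the *active* parameters, not the total. The figures below are illustrative assumptions (Kimi K2 is reported as ~1T total with ~32B active per token; treat those as approximate).

```python
# Fraction of weights actually touched per token in an MoE model.
# total_b / active_b in billions; numbers used below are assumptions.
def active_fraction(total_b: float, active_b: float) -> float:
    return active_b / total_b

frac = active_fraction(1000, 32)   # ~1T total, ~32B active (reported)
print(f"{frac:.1%} of weights touched per token")
```

So each token only exercises a few percent of the weights, which eases the memory-bandwidth and synchronization pressure that hurts dense models on many-core CPUs.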