Is there an ollama file yet? Waiting to see CPU perf on 1TB RAM
According to this: https://apxml.com/posts/gpu-system-requirements-kimi-llm
You need 32 x H100 80GB GPUs to run Kimi K2
These cost $30-45K each according to a quick search; 32 of them comes to roughly $1-1.4 million.
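As a sanity check on that GPU count, here's a rough back-of-envelope sketch. The numbers are assumptions, not confirmed specs: Kimi K2 is reported as roughly a 1T-parameter model, FP16 is taken as 2 bytes/param, and the 1.2x overhead factor for KV cache/activations is a guess.

```python
import math

# Illustrative VRAM sizing; all constants here are assumptions.
def vram_needed_gb(params_b: float, bytes_per_param: float,
                   overhead: float = 1.2) -> float:
    """Weights in GB, plus a rough fudge factor for KV cache/activations."""
    return params_b * bytes_per_param * overhead

def gpus_needed(total_gb: float, gb_per_gpu: float = 80.0) -> int:
    """How many 80GB cards to hold that much memory."""
    return math.ceil(total_gb / gb_per_gpu)

fp16 = vram_needed_gb(1000, 2.0)   # ~1T params at FP16
print(gpus_needed(fp16), "x 80GB GPUs")   # ~2400 GB -> 30 GPUs
```

That lands at ~30 cards, in the same ballpark as the 32 the article cites; quantizing to 8-bit or lower cuts the count roughly in half, which is why CPU boxes with ~1TB of RAM are even in the conversation.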

Discussion
It's available here: https://huggingface.co/moonshotai/Kimi-K2-Instruct/tree/main
Grabbed it day-of, but there wasn't an ollama model file for it yet
There are some necessary code changes that are in-flight: https://github.com/ollama/ollama/issues/11382
nostr:nprofile1qy2hwumn8ghj7un9d3shjtnddaehgu3wwp6kyqpq2akj8hpakgzk6gygf9rzlm343nulpue3pgkx8jmvyeayh86cfrusf8t2fq nostr:nprofile1qy2hwumn8ghj7un9d3shjtnddaehgu3wwp6kyqpqq3sle0kvfsehgsuexttt3ugjd8xdklxfwwkh559wxckmzddywnwsxeuf7k CPU cores are surprisingly good at inference.
It’s mixed. Apple’s Unified Memory punches way above its weight, but I’m a little disappointed with the performance I’m getting on a 2x 64-core Epyc. Dense models involve a lot of synchronization between cores; MoE seems to do better.
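One intuition for why MoE fares better on CPU: per-token compute scales with the *active* parameters, not the total. The figures below are illustrative assumptions (Kimi K2 is reported as ~1T total with ~32B active per token; treat those as approximate).

```python
# Fraction of weights actually touched per token in an MoE model.
# total_b / active_b in billions; numbers used below are assumptions.
def active_fraction(total_b: float, active_b: float) -> float:
    return active_b / total_b

frac = active_fraction(1000, 32)   # ~1T total, ~32B active (reported)
print(f"{frac:.1%} of weights touched per token")
```

So each token only exercises a few percent of the weights, which eases the memory-bandwidth and synchronization pressure that hurts dense models on many-core CPUs.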