You need to qualify that with a tokens-per-second figure. As soon as the Ollama fixes are in, I'm going to run it on $4k of chrome (1TB DDR4 / 128 cores). Probably Q4_K_XL, but maybe Q8_K_XL... just to find out.
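For anyone who wants a ballpark before the run, here is a rough sketch of the usual back-of-envelope estimate. It assumes decoding is memory-bandwidth bound, so each generated token requires reading every active weight once. The function name, the `efficiency` fudge factor, and all the numbers in the example call are hypothetical placeholders, not measurements of this box or of any particular model.

```python
def estimate_decode_tokens_per_sec(active_params_billion: float,
                                   bits_per_weight: float,
                                   mem_bandwidth_gb_s: float,
                                   efficiency: float = 0.5) -> float:
    """Rough decode-speed estimate for a memory-bandwidth-bound model.

    Assumption: each token reads all active weights from RAM once, and
    only a fraction (`efficiency`) of peak bandwidth is actually achieved.
    """
    # Bytes that must be streamed from memory per generated token.
    bytes_per_token = active_params_billion * 1e9 * bits_per_weight / 8
    # Achieved bandwidth divided by bytes per token gives tokens per second.
    return efficiency * mem_bandwidth_gb_s * 1e9 / bytes_per_token


# Hypothetical inputs: 30B active parameters, ~4.5 bits/weight,
# 200 GB/s of memory bandwidth, 50% achieved efficiency.
print(round(estimate_decode_tokens_per_sec(30, 4.5, 200), 1))
```

Under this model, doubling the bits per weight (roughly what moving from a 4-bit to an 8-bit quant does) halves the estimate, which is why the quant level matters as much as the hardware when quoting a speed.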
Discussion
You should also be able to use two $10k Nvidia rigs, or one $10k Mac Studio.