Just realized you can get an M4 MacBook with 128GB of unified RAM. That would be wild for large-model inference. A little more pricey but… hmm
70B LLMs would be easy to run...👀
🤔
Yeah you only get ~8 tokens a second though
True
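Rough sanity check on that number (a sketch, not a benchmark: the ~546 GB/s bandwidth figure for the 128GB M4 Max config and the "weights read once per token" model are my assumptions, not from this thread):

```python
# Back-of-envelope: autoregressive decode is roughly memory-bandwidth-bound,
# since each generated token streams all active weights through memory once.
# All numbers below are assumptions for illustration, not measurements.

def est_tokens_per_sec(params_b: float, bytes_per_param: float, bandwidth_gbs: float) -> float:
    """Estimate an upper-bound decode speed for a dense model."""
    model_gb = params_b * bytes_per_param  # weight footprint in GB
    return bandwidth_gbs / model_gb

# Assumed: M4 Max (128GB config) advertises ~546 GB/s unified memory bandwidth.
for bits in (8, 4):
    tps = est_tokens_per_sec(70, bits / 8, 546)
    print(f"70B @ {bits}-bit: ~{tps:.0f} tok/s ceiling")
```

At 8-bit that ceiling lands right around 8 tok/s, which matches the claim; a 4-bit quant roughly doubles it. Either way the weights fit comfortably in 128GB.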
Need to max out the cores. Idk if a 14-core CPU is enough
A single-core CPU is enough tbh