
Discussion

Fucking G-Nasa uses Ollama and not any of the real engines

Ollama wraps llama.cpp, which is fantastic for single-node inference. If you have a cluster or a specific setup that aligns with one of the other frameworks, you might do better, but if you just want to run the latest models, it's the place to be.