mistral-7b-instruct feels like ChatGPT, it's running fast on my MacBook, and it's only a 5GB model. Wow!
Discussion
Have you tried llava? I’m amazed by how it runs on my 16GB Mac under ollama.ai.
Not yet!
The smaller the model, the faster it will run.
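A rough back-of-envelope for why that tends to hold, assuming the commonly cited rule of thumb that single-user token generation is limited by how fast the weights can be read from memory. The parameter count, bits per weight, and bandwidth below are illustrative assumptions, not measurements from anyone's machine, and the function names are made up for this sketch:

```python
def model_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def max_tokens_per_second(size_gb: float, bandwidth_gb_s: float) -> float:
    """Crude upper bound: every generated token reads all weights once."""
    return bandwidth_gb_s / size_gb


if __name__ == "__main__":
    # Assumed values: ~7 billion parameters, ~5 bits per weight after
    # quantization, and a hypothetical 100 GB/s of memory bandwidth.
    size = model_size_gb(7e9, 5)
    print(f"approx. weight size: {size:.1f} GB")
    print(f"rough ceiling: {max_tokens_per_second(size, 100.0):.0f} tokens/s")
```

Real speeds will be lower than this ceiling, but it shows the trend: halve the model size and the ceiling doubles.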
Ollama is also good. I think it uses the GPU better than llama.cpp.
You can also run mistral-7b on iOS: