What's the best LLM to run on your local machine without needing a heavy GPU?
Llama et al. all seem to require a chunky GPU, but surely we're at the stage (3 years later) where we have some local LLMs?
Llama.cpp runs fine on an M2.
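If you want to poke at it from Python rather than the CLI, here's a minimal sketch using the llama-cpp-python bindings. The GGUF path is a placeholder for whatever quantized model you've downloaded, so treat it as an untested starting point rather than a recipe:

```python
# Minimal llama.cpp example via the llama-cpp-python bindings
# (pip install llama-cpp-python). The model path below is a placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-3.1-8b-instruct-q4_k_m.gguf",  # hypothetical local GGUF file
    n_ctx=4096,        # context window size
    n_gpu_layers=-1,   # offload all layers to Metal on Apple Silicon; use 0 for CPU-only
)

out = llm("Q: Name three offline uses for a local LLM.\nA:", max_tokens=128, stop=["Q:"])
print(out["choices"][0]["text"])
```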
I'm assuming this is Mac?
I'm a PC guy (Windows, Linux and WSL),
but this gives me hope if it's running on the M2 chipset. I'll need to do some digging around.
If you're on a Mac, up to ~75% of your unified memory is available to the GPU. If you really are CPU-only, your options are pretty limited.
If you really don't have options, you could see how fast Phi-3 mini is for you. That one's small enough to run on many phones.
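If you want to try Phi-3 mini without any special tooling, a rough CPU-only sketch with Hugging Face transformers is below. It assumes the `microsoft/Phi-3-mini-4k-instruct` checkpoint and a recent transformers version; expect it to be slow but workable without a GPU:

```python
# Rough sketch: running Phi-3 mini on CPU with transformers
# (pip install transformers torch). Slow without a GPU, but it runs.
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="microsoft/Phi-3-mini-4k-instruct",
    device=-1,               # -1 = CPU only
    trust_remote_code=True,  # older transformers versions need this for Phi-3
)

result = pipe("Give me one offline use case for a local LLM.", max_new_tokens=64)
print(result[0]["generated_text"])
```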
oh yeah?
Nothing crazy, I just want to mess around with local/offline use cases for LLMs. I've been looking around but never really deployed anything locally yet.
If you don't have a GPU or a Mac, try LLM Farm (iOS) or mllm (Android)
I have a PC - 6GB card and 16GB RAM.
I mean, for the life of me I can't seem to find the right model to use. Llama seems great; in fact, imo it runs better than ChatGPT, but that's probably because they have more NLP data from Facebook/Meta platforms...
Is there a way to run Llama locally without needing AWS, Azure, or an RTX 4090?
You should be able to run https://ollama.com with Microsoft's `phi3:mini-128k` or `llama3.1:8b-instruct-q3_K_M`. You can see the pre-configured models and their rough sizes at https://ollama.com/library.
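If you'd rather script against it than use the CLI, Ollama also serves a local HTTP API on localhost:11434. A minimal sketch, assuming you've already done `ollama pull phi3:mini-128k`:

```python
# Query a locally running Ollama server over its HTTP API.
# Assumes `ollama serve` is running and phi3:mini-128k has been pulled.
import json
import urllib.request

payload = {
    "model": "phi3:mini-128k",
    "prompt": "Explain quantization in one sentence.",
    "stream": False,  # return one JSON object instead of a token stream
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```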
This has been informative. Thanks all for the suggestions. Time to deep dive.
TinyBERT