Can it run any Llama 2 fine-tuned model?
Introducing LlamaGPT: a self-hosted, offline, and private AI chatbot, powered by Llama 2, with absolutely no data leaving your device.
Yes, an entire LLM.
Your Umbrel Home, Raspberry Pi (8GB) Umbrel, or custom umbrelOS server can run it with just 5GB of RAM!
Word generation benchmarks:
Umbrel Home: ~3 words/sec
Raspberry Pi (8GB RAM): ~1 word/sec
Watch the demo: https://youtu.be/iu3_1a8SzeA
Install on umbrelOS: https://apps.umbrel.com/app/llama-gpt
GitHub: https://github.com/getumbrel/llama-gpt
Discussion
Currently it uses the Nous Hermes Llama 2 (7B). If you're technical, you can customize the Dockerfile to run a different Llama model (see the sketch below): https://github.com/getumbrel/llama-gpt/blob/c76225a6fc26a000fc07b074223a69b0d65b7bcf/api/Dockerfile#L6
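Roughly, that customization amounts to pointing the image at a different model file. Here is a minimal sketch of the idea, assuming the Dockerfile selects the model through environment variables; the actual variable names, base image, and download mechanism are whatever the linked line defines, and the model path and URL below are only placeholders:

    # Sketch only - mirror the structure of the real api/Dockerfile,
    # which pins the default model on the line linked above.
    FROM python:3.11-slim-bullseye

    # Placeholder values: swap in any Llama 2 fine-tune whose quantized
    # weights fit in your device's RAM (cf. the ~5GB figure above).
    ENV MODEL=/models/your-llama2-finetune.q4_0.bin
    ENV MODEL_DOWNLOAD_URL=https://example.com/your-llama2-finetune.q4_0.bin

    # The rest of the build (installing the API server dependencies and
    # fetching the model) stays as in the upstream repository.

The main practical constraint is memory: whatever fine-tune you substitute has to fit in the roughly 5GB of RAM the default 7B setup needs on an Umbrel Home or Raspberry Pi.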