I'll have to test it. Thank you, Desobediente. Question: do you know how much space it uses?
To me, this is the most interesting LLaMA-related project right now:
https://github.com/ggerganov/llama.cpp
Very easy to set up:
1. git clone the project and set it up:
https://github.com/ggerganov/llama.cpp#get-the-code
2. Select an already quantized Llama 2 model from here: https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGML/tree/main
3. ./main -m ./models/your_downloaded_model.bin -n 128
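The three steps above can be sketched as one small script. This is only a rough sketch under a few assumptions: the build step is a plain `make` (what the project README described when this was written; newer versions may differ), the model file was saved into `./models/` inside the cloned directory, and `your_downloaded_model.bin` is a placeholder for whichever file you picked. The `run` helper and the `DRY_RUN` and `MODEL` variables are my own additions and are not part of llama.cpp. By default the script only prints the commands so you can check them first.

```shell
#!/bin/sh
# DRY_RUN and MODEL are hypothetical helper variables, not llama.cpp options.
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = actually run them
MODEL="${MODEL:-./models/your_downloaded_model.bin}"   # placeholder file name

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "+ $*"
    else
        "$@"
    fi
}

setup_and_run() {
    # Step 1: get the code and build it (assumes the plain `make` build).
    run git clone https://github.com/ggerganov/llama.cpp
    run cd llama.cpp
    run make
    # Step 2 is manual: download a quantized model and put it at $MODEL.
    # Step 3: generate up to 128 tokens (-n 128) with the chosen model (-m).
    run ./main -m "$MODEL" -n 128
}

setup_and_run
```

Run it once as is to see the commands, then with `DRY_RUN=0` (and `MODEL` pointing at your file) to execute them.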