To me, this is the most interesting LLaMA-related project right now:

https://github.com/ggerganov/llama.cpp

Very easy to set up:

1. git clone the project and set it up

https://github.com/ggerganov/llama.cpp#get-the-code

2. Select an already quantized Llama 2 model from here: https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGML/tree/main

3. Run it: ./main -m ./models/your_downloaded_model.bin -n 128
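
If it helps, step 3 can be wrapped in a small pre-flight check so a missing binary or model file gives a readable message instead of an error. This is only a sketch: it assumes you are inside the llama.cpp directory after building, and the model file name is the placeholder from step 3, not a real file.

```shell
#!/bin/sh
# Placeholder path from step 3: replace with the .bin file you actually downloaded.
MODEL="./models/your_downloaded_model.bin"
# -n limits how many tokens are generated.
N_TOKENS=128

if [ -x ./main ] && [ -f "$MODEL" ]; then
    # Same command as step 3: -m selects the model file.
    ./main -m "$MODEL" -n "$N_TOKENS"
else
    # Nothing is run here; this only reports what is missing.
    echo "Not ready: need an executable ./main and a model at $MODEL" >&2
    echo "Would run: ./main -m $MODEL -n $N_TOKENS"
fi
```

Raising N_TOKENS gives longer output at the cost of a longer run.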

Have to test it. Thank you, Desobediente. Question: do you know how much space it uses?
