If you wanna try llama2 70B:
How much fucking VRAM do you need to run this model?
Dunno, but here's a pure C Llama2 model that runs crazy fast on CPU
I just got mightily played by that llama!
Got all stoked that it could generate custom crossword puzzles and then it comes up with this: