Replying to Pablo Xannybar

For anyone else playing with LLaMA 2, the new k-quant models are definitely the ones to go for. I just compared the two, and the Q4_K_S model is much faster, less RAM-intensive, and produces higher-quality output than the regular Q4.

N3WD3V 2y ago

Good work mate 👏
