Summarizing https://www.xzh.me/2023/09/a-perplexity-benchmark-of-llamacpp.html

Here's my try:

The author presents perplexity benchmark results for llama.cpp on the WikiText-2 test set, comparing quantization methods with different bit widths and variants. The author also provides a table of the VRAM required for model parameters, in MB. The results suggest that even at aggressive quantization levels, the number of parameters remains the determining factor in a large language model's performance.
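For context on the metric being benchmarked: perplexity is the exponential of the average negative log-likelihood per token, so lower is better. A minimal sketch of that computation (the function name and example values are illustrative, not from the post):

```python
import math

def perplexity(token_logprobs):
    # Perplexity = exp(mean negative log-likelihood per token).
    # token_logprobs holds the natural-log probability the model
    # assigned to each actual next token in the evaluation text.
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# A model that assigns probability 0.25 to every token has a
# per-token NLL of ln 4, so its perplexity is 4.
print(perplexity([math.log(0.25)] * 4))  # ≈ 4.0
```

Quantization perturbs the model's predicted probabilities slightly, which is why the benchmark reports perplexity deltas across quantization formats.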
