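Running it on an AMD RDNA2 GPU with ROCm acceleration, using a GGUF quant with the layers split between the GPU and DDR4 system RAM. It’s very fast compared to my favorite 70Bs.
Over 4 tokens/s on Q5_K_M. In comparison, Euryale 1.3 and LZLV do around 0.9 tokens/s on the same setup, so this is roughly four times faster.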