Running on an AMD RDNA2 GPU with ROCm acceleration, using a GGUF quant with the layers split between the GPU and DDR4 system RAM. It's very fast compared to my favorite 70Bs.

Over 4 tokens/s on Q5_K_M. In comparison, Euryale 1.3 and LZLV do around 0.9 tokens/s.
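
To put those numbers in perspective, here is a rough back-of-the-envelope sketch. It treats "over 4" as exactly 4.0 tokens/s, and the 300-token reply length is an arbitrary assumption for illustration, not something I measured.

```python
# Throughput figures from the note above, in tokens per second.
# "Over 4" is rounded down to 4.0, so the real speedup is slightly higher.
this_model_tps = 4.0
other_70b_tps = 0.9  # roughly what Euryale 1.3 and LZLV manage

def seconds_for(tokens: int, tps: float) -> float:
    """Wall-clock seconds to generate `tokens` at a steady `tps`."""
    return tokens / tps

speedup = this_model_tps / other_70b_tps
reply_tokens = 300  # hypothetical reply length, chosen only for illustration

print(f"speedup: {speedup:.1f}x")
print(
    f"{reply_tokens} tokens: "
    f"{seconds_for(reply_tokens, this_model_tps):.0f}s vs "
    f"{seconds_for(reply_tokens, other_70b_tps):.0f}s"
)
```

So that works out to roughly a 4.4x speedup, or a bit over a minute per reply instead of more than five.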
