70B Llama 2 at 35 tokens/second on 4090
Link: https://github.com/turboderp/exllamav2
Discussion: https://news.ycombinator.com/item?id=37492986