Is it supposed to be so slow? I am getting about 1 token per second on my M1 with 8 threads. It seems to run a bit faster on my Ryzen with 16 threads, so maybe it's not as optimized for Apple silicon yet? His demo seemed much faster than mine though :/
Use 4 threads; it’s much faster. How much RAM do you have?