Yes, to run LLaMA locally on my MacBook. 😄

Supporting Edge AI (AI you can run where your data is, not in the cloud) fits nicely with my personal mission of “Strengthening individual power and independence”, so I help where I can. 🚀


Discussion

Is it supposed to be so slow? I am getting about 1 token a second on my M1 with 8 threads. It seems to work a bit faster on my Ryzen with 16 threads, so maybe it’s not as optimized on Apple Silicon yet? His demo seemed much faster than mine though :/

Most likely you are doing something wrong… it should be reasonably fast.

How did you build it and how do you invoke it?

Use 4 threads, it’s much faster. How much RAM do you have?
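
For anyone hitting the same slowdown, a minimal build-and-run looks something like this (the model path and prompt are placeholders; -t 4 pins it to the M1’s four performance cores):

# from the llama.cpp repo root: build, then run the quantized 7B model on 4 threads
make
./main -m ./models/7B/ggml-model-q4_0.bin -t 4 -n 128 -p "Building a website can be done in 10 simple steps:"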

You use a MacBook? I bet you haven’t abandoned NixOS, right?

It runs fast on my Ryzen 1800X (NixOS) on 16 threads:

system_info: n_threads = 16 / 16 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0

my M1 shows:

system_info: n_threads = 4 / 8 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0

using about 1.5 GB of RAM?

Nixpkgs works on macOS perfectly!
