How long does it take on your Pixel? (Which model?)

Discussion

4 seconds. But it's all hacked up with Python shit everywhere. I need to explore a more native library. More to come.

The Python shit is usually not the bottleneck; it probably uses native libs for the LLM stuff. Are you using the TPU?

Frankly, I am just trying a bunch of stuff/libraries/demo apps to see where we are at with local LLMs. Most of what I am seeing is just very poor ports of server runtimes, which is terrible.