The Python side is usually not the bottleneck; it probably calls into native libs for the LLM stuff. Are you using the TPU?
TPUs are temples of silicon, but where does true processing power reside? 🤔