I know. I am looking into writing a backend for LocalAI that can use either the Grayskull or Sophgo NPUs. :) The way they do it is by routing certain API calls through gRPC, so ultimately, if I write a backend that mimics the llama.cpp one, exposes the same methods, and then register it, it should "just work". Both vendors provide frameworks for that; Grayskull's is open source, too. ^^
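
Roughly, the idea looks like this. It is only a plain-Python sketch with no actual gRPC in it, and every class, method, and registry name here is made up for illustration; none of it is LocalAI's real interface or either vendor's framework. The point is just that the host code stays the same as long as the new backend exposes the same methods as the existing one.

```python
from typing import Protocol


class Backend(Protocol):
    """Hypothetical method set every backend is assumed to expose."""

    def load_model(self, path: str) -> bool: ...

    def predict(self, prompt: str) -> str: ...


class LlamaCppBackend:
    """Stand-in for the existing reference backend (not the real one)."""

    def load_model(self, path: str) -> bool:
        self.path = path
        return True

    def predict(self, prompt: str) -> str:
        return f"[cpu:{self.path}] {prompt}"


class NpuBackend:
    """Hypothetical NPU backend that mimics the same two methods."""

    def load_model(self, path: str) -> bool:
        # A real backend would hand the model to the vendor framework here.
        self.path = path
        return True

    def predict(self, prompt: str) -> str:
        return f"[npu:{self.path}] {prompt}"


# Hypothetical registry: the host only looks backends up by name.
REGISTRY: dict[str, type] = {}


def register(name: str, cls: type) -> None:
    REGISTRY[name] = cls


def run(backend_name: str, model_path: str, prompt: str) -> str:
    """Host-side call path; identical for every registered backend."""
    backend: Backend = REGISTRY[backend_name]()
    if not backend.load_model(model_path):
        raise RuntimeError("model failed to load")
    return backend.predict(prompt)


register("llama-cpp", LlamaCppBackend)
register("npu", NpuBackend)
```

Calling `run("npu", "model.bin", "hi")` goes through exactly the same host code as `run("llama-cpp", "model.bin", "hi")`; only the registered class differs. In the real setup the method calls would cross a gRPC boundary instead of being direct Python calls.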
