I have something even more fascinating. It's still underdeveloped, but it definitely has potential.
Discussion
On-device models can take care of private events. Otherwise, I am not sure they are that interesting for public events, since constantly updating the local model to match the newest events coming in online is challenging.
I was attempting that, and it's quite challenging. So, I will be working on it with Gemini.
This will focus on private tasks like TTS, PDF summarization, basic Q&A, etc., using small, task-specific language models with WebGPU.
It would be nice if we could set the address of an OpenAI API-compatible LLM to give users a choice.
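For what it's worth, here is a rough sketch of how a configurable endpoint might look. It only assumes the common convention that OpenAI-compatible servers accept a POST to `<base URL>/chat/completions` with a `model` and a `messages` array. The names `LlmEndpointConfig` and `buildChatRequest`, and the example address, are made up for illustration and not taken from any existing project.

```typescript
// Hypothetical settings a user could edit; field names are illustrative only.
interface LlmEndpointConfig {
  baseUrl: string; // placeholder example: "http://localhost:8080/v1"
  model: string;   // whatever model name the chosen server exposes
  apiKey?: string; // optional, since many local servers need no key
}

interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds the request for an OpenAI-compatible chat completions endpoint
// without sending it, so the same code works for any base URL the user sets.
function buildChatRequest(config: LlmEndpointConfig, prompt: string): ChatRequest {
  // Drop trailing slashes so "…/v1" and "…/v1/" produce the same URL.
  const base = config.baseUrl.replace(/\/+$/, "");
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (config.apiKey) {
    headers["Authorization"] = `Bearer ${config.apiKey}`;
  }
  return {
    url: `${base}/chat/completions`,
    init: {
      method: "POST",
      headers,
      body: JSON.stringify({
        model: config.model,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}
```

Something like `fetch(req.url, req.init)` would then send it, and the reply text usually sits under `choices[0].message.content` in the JSON response. Keeping the URL in user settings means the app does not care whether the model is hosted or running locally.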