https://ollama.com/blog/minions
The Minions project is a new approach to reducing cloud costs for large language models (LLMs) by shifting part of the workload to consumer devices. It does this by having small on-device models collaborate with larger models in the cloud. The project defines two protocols: Minion and MinionS.
* Minion: In this protocol, a cloud model chats freely with a single local model that holds the data on-device until they converge on a solution (sketched after this list). It achieves a 30.4x reduction in remote costs while maintaining 87% of the cloud model's performance.
* MinionS: This protocol decomposes the task into smaller subtasks that small local LLMs run in parallel over chunks of the context (sketched below). It achieves a 5.7x reduction in remote costs while maintaining 97.9% of the cloud model's performance.
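The following is a minimal sketch of the Minion chat loop, assuming the `ollama` Python client for the local model. `cloud_chat` is a hypothetical stand-in for whatever remote LLM API you use, and the prompts are illustrative rather than the project's actual prompts. The key idea is that the long context never leaves the device; only the short back-and-forth messages go to the cloud.

```python
import ollama

def cloud_chat(messages: list[dict]) -> str:
    """Hypothetical wrapper around a remote (cloud) LLM API -- swap in your provider's call."""
    raise NotImplementedError

def minion(question: str, context: str, local_model: str = "llama3.2", max_rounds: int = 5) -> str:
    # The cloud model sees only the question and the local model's replies,
    # never the full context -- that is where the remote-cost savings come from.
    cloud_messages = [{
        "role": "user",
        "content": (f"Answer this question: {question}\n"
                    "A local assistant holds the relevant document. Ask it questions, "
                    "one at a time. Reply 'FINAL: <answer>' when you are done."),
    }]
    for _ in range(max_rounds):
        cloud_reply = cloud_chat(cloud_messages)
        if cloud_reply.startswith("FINAL:"):
            return cloud_reply.removeprefix("FINAL:").strip()
        # The local model answers using the on-device context.
        local_reply = ollama.chat(
            model=local_model,
            messages=[{"role": "user", "content": f"{context}\n\n{cloud_reply}"}],
        )["message"]["content"]
        cloud_messages += [{"role": "assistant", "content": cloud_reply},
                           {"role": "user", "content": local_reply}]
    return "no answer within round limit"
```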
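And here is a rough sketch of the MinionS decomposition step under the same assumptions: the `ollama` client runs the small model locally, the chunking and subtask prompt are illustrative, and a cloud model (not shown) would aggregate the per-chunk outputs into a final answer.

```python
from concurrent.futures import ThreadPoolExecutor
import ollama

def run_subtask(chunk: str, subtask: str, local_model: str = "llama3.2") -> str:
    # Each small LLM sees only one chunk of the context, so chunks can be
    # processed in parallel on-device.
    return ollama.chat(
        model=local_model,
        messages=[{"role": "user", "content": f"{subtask}\n\nContext chunk:\n{chunk}"}],
    )["message"]["content"]

def minions(question: str, context: str, chunk_size: int = 2000) -> list[str]:
    chunks = [context[i:i + chunk_size] for i in range(0, len(context), chunk_size)]
    subtask = f"Extract any information relevant to: {question}"
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda c: run_subtask(c, subtask), chunks))
    return results  # a cloud model would synthesize these into the final answer
```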