I'm currently using Gemini models with their mighty 2-million-token context window. I fill a portion of that context with a knowledge base, then use a system prompt to shape the model's behavior accordingly.
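To make the idea concrete, here is a minimal sketch of packing knowledge-base documents into a context budget alongside a system prompt. It is not my app's actual code: `estimate_tokens`, `build_context`, the 4-characters-per-token heuristic, and the reserve size are all assumptions for illustration, and it makes no call to any model API.

```python
def estimate_tokens(text: str) -> int:
    # Assumption: roughly 4 characters per token. A real tokenizer will differ.
    return max(1, len(text) // 4)


def build_context(system_prompt, documents, budget_tokens=2_000_000, reserve_tokens=100_000):
    """Pack documents, in order, until the estimated token budget is reached.

    reserve_tokens is headroom left for chat history and the model's reply
    (a hypothetical figure, tune it to your own use).
    """
    limit = budget_tokens - reserve_tokens
    used = estimate_tokens(system_prompt)
    kept = []
    for doc in documents:
        cost = estimate_tokens(doc)
        if used + cost > limit:
            break  # stop at the first document that would not fit
        kept.append(doc)
        used += cost
    return {
        "system": system_prompt,
        "knowledge": "\n\n".join(kept),
        "estimated_tokens": used,
    }
```

The returned dict is just a convenient shape for this sketch; how the system prompt and knowledge text get passed to the model depends on the client library you use.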
I've built many RAG apps before, but with this one I wanted to try something new: something that's easy to self-host at no cost, very lightweight, & stores all personality & chat history locally.
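As a rough illustration of what "storing personality & chat history locally" can look like, here is a small sketch that keeps both in a single JSON file. The file layout and the names `load_state`, `save_state`, and `append_message` are hypothetical, not the app's real format.

```python
import json
from pathlib import Path


def load_state(path):
    """Read the local state file, or return an empty state if it does not exist yet."""
    p = Path(path)
    if not p.exists():
        return {"personality": "", "history": []}
    return json.loads(p.read_text(encoding="utf-8"))


def save_state(path, state):
    """Write the whole state back to disk as human-readable JSON."""
    Path(path).write_text(
        json.dumps(state, ensure_ascii=False, indent=2), encoding="utf-8"
    )


def append_message(path, role, text):
    """Add one chat message to the history and persist it immediately."""
    state = load_state(path)
    state["history"].append({"role": role, "text": text})
    save_state(path, state)
    return state
```

Rewriting the whole file on every message is simple and fine for small histories; a larger history would probably want an append-only log or SQLite instead.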
For more details, read this:
note1fk0w6a4sqz2gtsphg42z54l546d368z0700gmp8yu0tgp6ng7dmqz4fku8