I like the idea of using a local 8B LLM on my laptop.
What’s the best way to add a web search layer on top of a model running locally, e.g. in Ollama? Are there any standard, well-established solutions?
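For concreteness, the naive version of what I have in mind looks roughly like this. Just a sketch: the `duckduckgo_search` and `ollama` Python packages and the `llama3.1:8b` model name are my assumptions, not a recommendation.

```python
# Sketch: fetch a few web snippets and stuff them into the prompt
# before calling the local model. Assumes `pip install ollama
# duckduckgo_search` and a locally pulled llama3.1:8b model.
import ollama
from duckduckgo_search import DDGS

def answer_with_search(question: str, model: str = "llama3.1:8b") -> str:
    # Grab a handful of search snippets for the question.
    results = DDGS().text(question, max_results=5)
    context = "\n\n".join(f"{r['title']}\n{r['body']}" for r in results)

    # Ask the local model to answer from the retrieved snippets only.
    prompt = (
        "Answer the question using the web snippets below.\n\n"
        f"Snippets:\n{context}\n\nQuestion: {question}"
    )
    response = ollama.chat(model=model, messages=[{"role": "user", "content": prompt}])
    return response["message"]["content"]

print(answer_with_search("What is the current stable Linux kernel version?"))
```

Is prompt-stuffing like this basically what the standard tools do, or is there something smarter?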
I tried RAG (Retrieval-Augmented Generation) myself, but it didn’t work as expected. Even running a separate embedding model beforehand to retrieve relevant context didn’t have the desired effect.
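Roughly what my attempt looked like, again just a sketch: `nomic-embed-text` as the embedding model and the cosine-similarity scoring are stand-ins for what I actually ran.

```python
# Sketch of the RAG attempt: embed text chunks with a separate embedding
# model, rank them by cosine similarity to the question, and keep the top k
# to prepend to the prompt. Assumes the `ollama` package and a pulled
# nomic-embed-text model (placeholder for my actual setup).
import ollama
import numpy as np

def embed(text: str) -> np.ndarray:
    return np.array(ollama.embeddings(model="nomic-embed-text", prompt=text)["embedding"])

def retrieve(question: str, chunks: list[str], k: int = 3) -> list[str]:
    q = embed(question)

    def score(chunk: str) -> float:
        v = embed(chunk)
        # Cosine similarity between question and chunk embeddings.
        return float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))

    return sorted(chunks, key=score, reverse=True)[:k]
```

The retrieved chunks often just weren’t the ones the answer needed, so maybe my chunking or retrieval was the weak link rather than the 8B model itself.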