If you want to self-host AI that you can optionally access on the go, give KoboldCpp a try. If you mainly use AI for entertainment like I do, you don't need a supercomputer to do it (6GB of VRAM is enough for smaller models). No filters if you use an unfiltered model, and your data stays local.
Another option is Ollama, with many models available. No internet access needed. I use it on Linux and I find it useful.
Ollama is extremely hostile to the llama.cpp ecosystem, so I don't recommend it. KoboldCpp has some Ollama emulation on board. From day one, Ollama has been trying to create a walled garden by hijacking upstream model efforts, running a closed model repo, and attempting to force a custom API through.
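For anyone curious what "accessing it on the go" looks like in practice: a minimal Python sketch of a client talking to a local KoboldCpp instance. This assumes the server exposes an OpenAI-compatible chat endpoint on port 5001 (KoboldCpp's usual default); adjust the URL for your setup.

```python
import json
import urllib.request

# Assumption: KoboldCpp listening on its default port 5001 with an
# OpenAI-compatible endpoint. Change this if your server differs.
URL = "http://localhost:5001/v1/chat/completions"

def build_payload(prompt, max_tokens=200):
    """Build an OpenAI-style chat request body."""
    return {
        "model": "local",  # model name is typically ignored by local backends
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 0.7,
    }

def ask(prompt):
    """POST the prompt to the local server and return the reply text."""
    data = json.dumps(build_payload(prompt)).encode()
    req = urllib.request.Request(
        URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(ask("Write a one-line greeting."))
```

Because the request format follows the OpenAI chat schema, the same client works unchanged against anything else that emulates that API, which is part of why a closed custom API is worth avoiding.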
