Replying to HERMETICVM

I have an AMD Radeon 6700 XT with 12GB VRAM running ollama + openwebui. ollama supports most somewhat-open LLMs and even runs on Android. You can feed it many models from Huggingface, especially the uncensored ones. OpenWebUI is one of many ChatGPT-like frontends; it adds things like web search and answering questions over your own documents (RAG). If you're curious to try it, DM me an email address and I'll create a user for you.
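For reference, loading a GGUF file downloaded from Huggingface into ollama goes through a Modelfile. A minimal sketch, assuming ollama is already installed and running; the GGUF filename and the local model name are placeholders:

```shell
# Write a minimal Modelfile pointing at a local GGUF file
# (the filename below is a placeholder for whatever you grabbed from Huggingface)
cat > Modelfile <<'EOF'
FROM ./mistral-7b-instruct.Q4_K_M.gguf
EOF

# Register it under a local name, then chat with it interactively
ollama create mistral-local -f Modelfile
ollama run mistral-local
```

`ollama create` imports the weights into ollama's own store, so the original GGUF can be deleted afterwards.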

Unfortunately, switching models means dropping the current one from memory and loading another, and pulling a 12GB model into VRAM takes several seconds each time. Multi-user setups therefore need either expensive datacenter GPUs or many smaller instances, each with multiple top-of-the-line GeForce cards.
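Ollama does expose a few server-side knobs that soften the swapping problem when VRAM allows; a sketch of the relevant environment variables (the values here are illustrative, not recommendations):

```shell
# Keep up to two models resident in memory instead of swapping on every switch
export OLLAMA_MAX_LOADED_MODELS=2
# Serve several requests against a loaded model concurrently
export OLLAMA_NUM_PARALLEL=4
# How long an idle model stays loaded before being evicted (default is 5m)
export OLLAMA_KEEP_ALIVE=10m
```

These need to be set in the environment of the `ollama serve` process; with 12GB of VRAM, two mid-size quantized models is about the practical ceiling.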

So it's more of a toy for now.

BaleNorge 11mo ago

Great! I had an old PC running and tried a few models in Open WebUI. It's super slow with an i5 processor and 16 GB of RAM, but I'd be interested in testing. Sending you a DM now.


Discussion

HERMETICVM 11mo ago

Check DMs. :)
