How do you do that? What resources do you need?

Discussion

I have an AMD Radeon 6700 XT with 12GB VRAM running ollama + Open WebUI. ollama supports most somewhat-open LLMs; it even runs on Android. You can feed it many models from Hugging Face, especially the uncensored ones. Open WebUI is one of many frontends similar to ChatGPT, with features like web search and using your own documents as context. If you're curious about trying it, DM me an email address and I'll create a user for you.
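
You can also poke at it from code: ollama exposes an HTTP API on localhost. A minimal sketch, assuming ollama is running on its default port 11434 and you've already pulled a model (the model name here is just an example):

```python
import requests

# Query a locally running ollama instance over its HTTP API.
# Assumes `ollama serve` is listening on the default port 11434
# and the model was pulled beforehand (e.g. `ollama pull llama3`).
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",      # example name, use whatever you pulled
        "prompt": "Why is the sky blue?",
        "stream": False,        # one JSON object instead of a token stream
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["response"])
```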

Unfortunately, switching models means dropping another one from memory, and pulling a 12GB model back in takes a few seconds, so multi-user setups require either expensive datacenter GPUs or many smaller instances with multiple top-of-the-line GeForces.

So it's more of a toy for now.

Unfortunately, at sizes between 4B and 14B parameters (some models go beyond 400B), there is no one-size-fits-all model across tasks, so just removing all but one model and keeping it in memory all the time isn't an option.
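
For what it's worth, ollama's API docs describe a keep_alive parameter for pinning a model in memory, and an /api/ps endpoint (the HTTP counterpart of `ollama ps`) for seeing what's resident. A sketch under the same assumptions as above, though with several models and 12GB of VRAM something still has to get evicted:

```python
import requests

# Load a model and pin it in VRAM indefinitely with keep_alive=-1
# (a duration string like "30m" also works; the default is ~5 minutes).
# A generate request without a prompt just loads the model.
requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3", "keep_alive": -1},
    timeout=300,
)

# Show which models are currently resident, like `ollama ps` does.
loaded = requests.get("http://localhost:11434/api/ps", timeout=10)
for m in loaded.json().get("models", []):
    print(m["name"], m.get("size_vram"))
```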

But it's fun to hack around with, compare results across models, and learn a lot about AI/ML.
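
The comparison part is easy to script, too. A hypothetical loop that sends the same prompt to a few pulled models and prints the answers one after another (the model names are placeholders for whatever `ollama list` shows on your box):

```python
import requests

PROMPT = "Explain the difference between quantization and distillation."

# Example model names; substitute whatever you have installed.
for model in ("llama3", "mistral", "gemma2"):
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": PROMPT, "stream": False},
        timeout=600,
    )
    resp.raise_for_status()
    print(f"=== {model} ===")
    print(resp.json()["response"])
```

On a single 12GB card each iteration evicts the previous model, so you pay the reload cost mentioned above on every switch.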

Great! I had an old PC running and tried a few models in Open WebUI. It's super slow with an i5 processor and 16 GB of RAM, but I'd be interested in testing. Sending you a DM now.

Check DMs. :)