used open-webui for the first time with llama.cpp ROCm server + AMD RX 5700 + Llama 3 8B Instruct. very slick local AI workflow. speed is very fast. slightly dumber, but it's not too bad.
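For anyone wanting to reproduce this, a minimal sketch of that kind of setup (model filename, port, and build flags are illustrative; check them against your llama.cpp build and your ROCm install):

```shell
# Build llama.cpp with ROCm/HIP support, e.g.:
#   cmake -B build -DGGML_HIP=ON && cmake --build build --config Release

# Start the OpenAI-compatible server, offloading all layers to the GPU (-ngl 99):
./llama-server -m ./Meta-Llama-3-8B-Instruct.Q4_K_M.gguf -ngl 99 \
  --host 127.0.0.1 --port 8080

# Then add http://127.0.0.1:8080/v1 in open-webui as an
# OpenAI-compatible API connection.
```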


Discussion

Did you try ollama as well? It's unfortunate that it doesn't support Vulkan yet, when llama.cpp and Jan both do

what does ollama give me?

It's an alternative to llama.cpp (the server running the actual models); it's supposed to be more user-friendly and "plug and play" compared to llama.cpp.

from what I understand it just makes it easier to swap models, but it still uses llama.cpp under the hood.

Sorry, yeah it's more like a wrapper than an alternative.

Yes, but don't discount that value. I used to run llama.cpp on the command line because it's amazing that you can ./a-brain. These days I run ollama on the GPU computer and connect to it from my laptop or phone using Zed or Enchanted.

For me the best thing ollama gave me was the ability to easily pull different models from their library with ollama pull. Way easier than downloading them manually from somewhere and placing them in the right location.
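The workflow in question, for reference (model tags are examples; browse the ollama library for what's actually available):

```shell
ollama pull llama3:8b   # download a model from the ollama library
ollama list             # show models available locally
ollama run llama3:8b    # chat with it interactively
```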

yeah I can see that

Similar setup, love it

This is on a MacBook Pro?

Yeah I use that too!!!

I run it on my Mac Studio. Works well!

What else did you try?

There seems to be an active community and there are several new releases every week! Haven’t found a better frontend yet

I use a cloudflare tunnel so I can pull it up on my phone or laptop while I'm at work. It's my own ChatGPT!
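If anyone wants to try the tunnel approach, a quick ephemeral tunnel is a one-liner (assuming open-webui is listening on port 3000; for a stable hostname you'd set up a named tunnel instead):

```shell
# Exposes the local open-webui instance via a temporary
# *.trycloudflare.com URL printed on startup:
cloudflared tunnel --url http://localhost:3000
```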

I even have the Spirit of Satoshi on there. Gave access to my friends and family - but they roll their eyes at AI

I'm gonna expose the open-webui and llama.cpp server over wireguard so I can use my local llm anywhere. good idea.
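A sketch of the client side of that WireGuard setup (keys, IPs, and hostname are all placeholders; assumes the home server runs open-webui on 10.0.0.1:3000):

```shell
# /etc/wireguard/wg0.conf on the laptop/phone:
#
# [Interface]
# PrivateKey = <client-private-key>
# Address = 10.0.0.2/24
#
# [Peer]
# PublicKey = <server-public-key>
# Endpoint = home.example.com:51820
# AllowedIPs = 10.0.0.1/32

# Bring the tunnel up, then browse to http://10.0.0.1:3000
wg-quick up wg0
```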

Bought a used Mac Studio M1 Max for $1200 on eBay.

The dream is to be self-sovereign when it comes to AI compute as well as bitcoin.

My DREAM is to open it up and earn bitcoin. I like NVK's unleashed.chat idea - I want it to catch on.

Anonymous AI for all, paid with lightning micropayments.

what's the genuine use of these smaller LLMs?

I've seen some which people have run on Mac minis at home, but they don't seem that smart - as you said

Does the model retain more context about your project? I'm so bored of telling Claude that some niche Swift parameter was deprecated in 2015