Depending on your environment, you can get it up and running via Docker. Install Ollama and then set up Open WebUI, which is a solid interface for managing models; a rough sketch of the commands is below. Getting it running on Windows was surprisingly easy when I first tested it there.

https://github.com/open-webui/open-webui
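A minimal sketch of the Docker route, roughly what the two projects' READMEs describe at the time of writing (image tags and flags may have changed, so check the repos for the current versions):

    # Ollama in a container, models stored in a named volume
    docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

    # Open WebUI, pointed at the host's Ollama instance
    docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main

After that, Open WebUI should be reachable at http://localhost:3000 in your browser.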

Without a GPU it is slow but still usable depending on the task, and you'll need roughly 8-64 GB of RAM depending on the model (the large Llama 3 70B variant needs around 48 GB; the 8B variant fits in far less).
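Pulling and running a model is a one-liner once Ollama is up. The exact model tags change over time, so treat these as examples and check the Ollama library for what's current:

    # inside the Ollama container (or natively if you installed Ollama on the host)
    docker exec -it ollama ollama pull llama3
    docker exec -it ollama ollama run llama3 "Summarize this in one sentence: ..."

As a rule of thumb, RAM needs track the size of the quantized model file plus some overhead.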

I have it running on a single RTX 3090, and the DeepSeek and Mistral models are blazing fast, at least to me.
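If you go the Docker route with an NVIDIA card, the only extra step should be installing the NVIDIA Container Toolkit and passing the GPU through to the Ollama container, something like:

    # same as the CPU command above, plus GPU passthrough
    docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama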
