every day with goose using a local llm

nostr:nevent1qvzqqqqqqypzp3yw98cykjpvcqw2r7003jrwlqcccpv7p6f4xg63vtcgpunwznq3qqsrlmm3h99pqv2m7rp4chrax40y30tvc4fdjxmw4j6270h5glwvwlqnws5ey


Discussion

What are you working on currently, Jack?

what's your local llm setup?

ollama with qwq or gemma3

on mac?

yes

Just had a quick look into this and it seems possible to do it for free, i.e. run open-source models like Llama 2, Mistral, or Phi-2 locally on the Mac using Ollama.

No internet, no API keys, no limits and Apple Silicon runs them well.
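
For anyone following along, a minimal sketch of that setup (assuming Homebrew and that the Ollama server is running; mistral is just an example model name):

    # install ollama on macOS, then pull a model and chat with it locally
    brew install ollama
    ollama pull mistral
    ollama run mistral "Summarize this note in one sentence."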

you can even use dave with a setup like this. fully private local ai assistant that can find and summarize notes for you

nostr:note17c3zxygr2krkt90lyrvh5rxtfmnstkcpkjyxmkz5z3tleagc848qlfyu9m

Cool. I’m still learning. So much to play with!

Would 48 GB be sufficient?

Tried Qwen3 yet?

I just run qwen3:30b-a3b with a 64k context (tweaked in the Modelfile) and it can do things 🤙. Uses 43 GB.
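
For reference, that context tweak is just a two-line custom Modelfile (assuming 64k means 65536 tokens; qwen3-64k is a name I made up for the custom build):

    # Modelfile: raise the context window to 64k tokens
    FROM qwen3:30b-a3b
    PARAMETER num_ctx 65536

    # build the custom model and run it
    ollama create qwen3-64k -f Modelfile
    ollama run qwen3-64k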

How much video RAM is needed to run a version of the models that are actually smart though? I tried the Deepseek model that fits within 8 GB of video RAM, and it was basically unusable.

I wonder what I am doing wrong. I was so excited to get this set up, but I've been at it all day and keep running into hiccups. Here's my ChatGPT-assisted question:

I tried setting up Goose with Ollama using both qwq and gemma3, but I'm running into consistent errors in Goose:

error decoding response body

init chat completion request with tool did not succeed

I pulled and ran both models successfully via Ollama (>>> prompt showed), and pointed Goose to http://localhost:11434 with the correct model name. But neither model seems to respond in a way Goose expects — likely because they aren’t chat-formatted (Goose appears to be calling /v1/chat/completions).
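
A quick way to check what Goose is seeing: Ollama exposes an OpenAI-compatible endpoint, so you can hit it directly (the model name here assumes the qwq tag you pulled):

    # test the chat-completions endpoint Goose appears to be calling
    curl http://localhost:11434/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{"model": "qwq", "messages": [{"role": "user", "content": "hello"}]}'

If that returns a normal completion, chat formatting probably isn't the problem, and the "with tool" in the error may point at tool-calling support instead.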

nostr:nprofile1qqsgydql3q4ka27d9wnlrmus4tvkrnc8ftc4h8h5fgyln54gl0a7dgspp4mhxue69uhkummn9ekx7mqpxdmhxue69uhkuamr9ec8y6tdv9kzumn9wshkz7tkdfkx26tvd4urqctvxa4ryur3wsergut9vsch5dmp8pese6nj96 Are you using a custom Goose fork, adapter, or modified Ollama template to make these models chat-compatible?

what the hell is goose? someone please fill me in

Which model? Been testing for 14 months now, but Claude sets the bar high

nostr:note1hg637072ztnxv4ls30phzkpgcuxvq4kfstcstqtyxkvu80w9nwqsap67dk

Wouldn't have happened in Damus 🔥

here to help 🤗

At least I have self-custodial zaps set up in Primal web now. One of these days their iOS will too... 🥂

Oh, you answered a reply. Primal doesn't show replies to replies by default? 🤦‍♂️

next step is llms knowing each other's strengths and weaknesses, so we can have them select the best llm for a particular task.

Serial killer