Subnostr

ChatGPT has lost this round hard.

Gemini and Claude are much better. Gemini for voice, images, and general research; Claude as an "AGI": great at coding, but also at pretty much anything else.

How are the open models doing? What is the best?


Discussion

Patrick 5d ago

I've not used it as a self-hosted model, but I've been powering a lot of my AI flows with GLM 4.7 and have few complaints for what it is. It's what's powering my clawdbot and Ralph loops, and while it's nowhere close to the current Opus, I'd say it's not incredibly far behind Sonnet.

Rod 5d ago

Can I ask what hardware you're running it on?

E.g. would a Mac Studio do the job?

New to self-hosting but considering giving it a go.

Patrick 5d ago

I don't run any models. I use GLM through z.ai's coding plan, which is incredibly cheap.

I need to look at doing the same, but I've been using the mental model that we're still in the early-00s PC era of AI hardware. I think we'll see massive improvements that will make today's products look outdated. For now I'm spending my money on SOTA models, hoping that in a few years, when things stabilize and new hardware lowers current costs, I can put a year or two's worth of model costs into my own self-hosted hardware.

Rod 5d ago

Cool. I think I misread your prior note.

I am pleased to hear you rate GLM 4.7 well. The hardware I have coming should run it.

Rod 5d ago

Fwiw I am having a very good time with Codex GPT 5.2 for long-running tasks.

Juraj 5d ago

Did you compare to Claude Opus 4.5?

reto 5d ago

What I like about Claude is its MCP capabilities. By renting an inexpensive Linux host and running a remote desktop controller on it, Claude became a personal assistant capable of installing and even coding its own tools (more MCP servers). Only the voice recognition is rather disappointing compared to ChatGPT.

Rod 5d ago

Yes, I regularly use both. I don't know if either is better or worse, but they definitely have different styles. With both set to max thinking, Opus is more trigger-happy and will edit stuff quickly, whereas 5.2 will spend ages thinking and researching before making any edits. I mean, we're truly spoiled for choice.

Dev Dave 🧑‍💻 4d ago

Opus seems to be a bit "buggy" these days, delivering worse results than before.

9sirtom5 5d ago

Alter seems to be actively anti-bias.

Like, ask it about COVID or something controversial.

https://alter.systems/

Juraj 5d ago

https://cypherpunk.today/static/chat.html#cryptoanarchy-nonreasoning

Askater 5d ago

Codex with GPT-5.2 is not bad

Juraj 3d ago

nostr:nevent1qqs07jnqssxc83ygrgwl5vtxkkzvjshehfmms7t0a233r5d0lnpc09qzyrdtd3sxt3pehxa0kzc0rl66p35zww7wtsv4nfq43tt2wzz375rmvqcyqqqqqqgzqsg2x

mccrmx 4d ago

I did some research and testing with gpt-oss-20b and llama-4-maverick. They don't compare with Claude but could definitely handle some tasks like extracting structured information from articles, DevOps, routine coding tasks, smart OCR etc.

The Chinese open models suffer from censorship on certain topics, but there is a model where somebody trained Qwen on DeepSeek outputs, and it appears to be "abliterated": it will respond to those topics and is pretty good overall. I'm going to run some of these models locally and will do a video/write-up about that.
