Nostr Web Client

I am lucky enough to have a PC that can run the lastest gpt-oss models from OpenAI. I am not impressed. It is like talking to Drax the Destroyer from Guardians of the Galaxy. Even when you explain the joke to it, it still misunderstands.

It also lies. Probably not purposely, it merely says what a human might say even though it is impossible for it.

YODL 5mo ago

What do you use them for? And are you pooping on OpenAI models specifically over others?

Reply to this note

Please Login to reply.

Discussion

YODL 5mo ago

Nm, seems from replies you like grok (gross; have never tried it).

Daniel Wigton 5mo ago

What is gross about grok?

YODL 5mo ago

Daniel Wigton 5mo ago

That isn't a picture of grok.

YODL 5mo ago

It has Musk's stench all over it. I can't help but want to dislike it.

Daniel Wigton 5mo ago

I don't think Musk had much to do with the weights generated. He paid for and helped roll out the cluster it was trained on, and he set the priorities. "Be maximally truth seeking" but there is no way he had time to curate the data or even design the over-all architecture. Thus, the actual AI isn't particularly Musk-like. It is actually quite similar to other AI of similar complexity.

The difference is a subtle shift in the data used (a bit less woke (whatever that means any more)) and the way it perfoms search.

Musk himself doesn't have any magic answers as to what is truth and what isn't. He is good with physics and engineering, but kinda dumb in other areas. So what you get is an AI that is better than average at searching the web to give the answer du jour.

It's a tool, it is good at its job or not irrespective or not of who made it.

For coding I prefer Claude 3.7 and Chatgpt 3o (I think, I can't keep them straight) but I have tried grok 4. I don't have money to throw around.

YODL 5mo ago

You have the zap I sent you. Sit on it a few more months and you'll have funds for all the LLMs!

I have lots of random questions about llms, but will hold off until I've tried some a bit more and have something semi-intelligent to say

ᴛʜᴇ ᴅᴇᴀᴛʜ ᴏꜰ ᴍʟᴇᴋᴜ 5mo ago

if it was about the CEOs he would be preferable to slimy sam altman

YODL 5mo ago

Fair point to consider

Daniel Wigton 5mo ago

No, I usually express my disappointment every time a new local AI comes out. These are just the newest. Generally my disappointment comes from models that try to do too much for their parameter count.

In my testing "small" (70b ish parameter) models are not big enough to do actual reasoning. I think you need to hit some magic scale where it pays off. Smaller than that they just make noises that make it look like they are reasoning, but the output is no better or even worse than a classic LLM.

For this reason I like Meta's llama 3.x models. They are unsurpassed at prompt following and they give as good of answers as you can expect for their parameter count.

You can get some improvement out of them, in some circumstances, if you prompt them to spell out their reasoning. They don't actually reason, but it allows them to correctly count r's in strawberry for instance.

I will pay OpenAI this compliment though. Their 120b parameter models somehow runs twice as fast as llama 70b parameters, on my machine. I don't think it is mixture of experts, I think they did something clever with the quantization. I haven't figured it out yet. It seems impossible. It should be bound by my RAM speed.

YODL 5mo ago

I lurnt some stuff from reply, thanks.

I'm mostly interested in coding agents, something I plan to play with down the line; I haven't used much beyond cursor a couple times to mess around. Not even sure if I can use llama inside it, but I assume so.

Local llm would be nice for but I'm guessing that's out of reach for my current tech and abilities