What do you use them for? And are you pooping on OpenAI models specifically over others?


Discussion

Nm, seems from replies you like grok (gross; have never tried it).

What is gross about grok?

That isn't a picture of grok.

It has Musk's stench all over it. I can't help but want to dislike it.

I don't think Musk had much to do with the weights. He paid for and helped roll out the cluster it was trained on, and he set the priorities ("be maximally truth seeking"), but there is no way he had time to curate the data or even design the overall architecture. So the actual AI isn't particularly Musk-like; it is actually quite similar to other AI of similar complexity.

The difference is a subtle shift in the data used (a bit less woke, whatever that means any more) and in the way it performs search.

Musk himself doesn't have any magic answers as to what is truth and what isn't. He is good with physics and engineering, but kinda dumb in other areas. So what you get is an AI that is better than average at searching the web to give the answer du jour.

It's a tool; it is good at its job or not, irrespective of who made it.

For coding I prefer Claude 3.7 and OpenAI's o3 (I think; I can't keep the names straight), but I have tried grok 4. I don't have money to throw around.

You have the zap I sent you. Sit on it a few more months and you'll have funds for all the LLMs!

I have lots of random questions about LLMs, but I will hold off until I've tried some a bit more and have something semi-intelligent to say.

If it were about the CEOs, he would be preferable to slimy Sam Altman.

Fair point to consider

No, I usually express my disappointment every time a new local AI comes out. These are just the newest. Generally my disappointment comes from models that try to do too much for their parameter count.

In my testing, "small" (70b-ish parameter) models are not big enough to do actual reasoning. I think you need to hit some magic scale where it pays off. Below that, they just make noises that look like reasoning, but the output is no better, or even worse, than a classic LLM's.

For this reason I like Meta's Llama 3.x models. They are unsurpassed at prompt following, and they give answers as good as you can expect for their parameter count.

You can get some improvement out of them, in some circumstances, if you prompt them to spell out their reasoning. They don't actually reason, but it lets them correctly count the r's in "strawberry", for instance.
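Something like this, as a minimal sketch; it assumes a local Llama served by Ollama at its default endpoint, and the model tag and prompt wording are just examples:

```python
# Minimal sketch of the "spell out your reasoning" trick against a local
# Llama served by Ollama. The endpoint is Ollama's default; the model tag
# is an example, use whatever you have pulled.
import requests

PROMPT = (
    "Spell out your reasoning step by step before answering:\n"
    "write out each letter of 'strawberry' on its own line, mark the r's, "
    "then count the marks. How many r's are in 'strawberry'?"
)

resp = requests.post(
    "http://localhost:11434/api/generate",  # Ollama's generate endpoint
    json={
        "model": "llama3.1:70b",  # example tag, not prescriptive
        "prompt": PROMPT,
        "stream": False,  # return one complete response instead of chunks
    },
    timeout=300,
)
print(resp.json()["response"])
```

With the letters forced out one per line, the count usually comes out right; ask the same question cold and it often doesn't.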

I will pay OpenAI this compliment though. Their 120b parameter model somehow runs twice as fast as llama 70b on my machine. It turns out it is a mixture of experts after all (only about 5b parameters are active per token), shipped with aggressive 4-bit quantization, so each token streams far less from RAM than a dense 70b does. It only looks impossible if you assume it has to be bound by reading all 120b of weights.
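Rough napkin math, with every number an assumption (~5b active parameters for the MoE, 4-bit weights throughout, and a made-up 50 GB/s of usable memory bandwidth):

```python
# Back-of-envelope estimate of a memory-bandwidth-bound token rate: each
# generated token has to stream the *active* weights through RAM once.
# All numbers here are assumptions, not measurements.

BANDWIDTH_GBPS = 50.0  # assumed usable RAM bandwidth, GB/s

def tokens_per_sec(active_params_billions: float, bits_per_weight: float) -> float:
    bytes_per_token = active_params_billions * 1e9 * bits_per_weight / 8
    return BANDWIDTH_GBPS * 1e9 / bytes_per_token

print(f"dense 70b @ 4-bit:            {tokens_per_sec(70, 4):5.1f} tok/s")
print(f"120b MoE, ~5b active @ 4-bit: {tokens_per_sec(5, 4):5.1f} tok/s")
```

The estimate is optimistic for the MoE (attention stays dense, and consecutive tokens route to different experts, so caching helps less), which is why the real-world gap is closer to 2x than to the raw ratio of active parameter counts.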

I lurnt some stuff from your reply, thanks.

I'm mostly interested in coding agents, something I plan to play with down the line; I haven't used much beyond Cursor a couple of times to mess around. I'm not even sure if I can use llama inside it, but I assume so.

A local LLM would be nice for that, but I'm guessing it's out of reach for my current tech and abilities.