This is a cross-post from Twitter. If you follow me for philosophical AI insights, this is for you. No ads. Let's begin!
Last post for a bit:
LLMs are literally autistic.
Within code, Claude and its peers do an insane job; the familiar numbers and patterns are easy for them to grasp.
With language, images, etc., the models hallucinate. You can "learn to speak their language" via prompt injection or by tailoring them over time.
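To make "speaking its language" concrete, here's a minimal sketch of pinning the register with a system prompt through the Anthropic Python SDK. The model ID and the prompt text are just illustrative placeholders, not anyone's production setup:

```python
import anthropic

client = anthropic.Anthropic()  # expects ANTHROPIC_API_KEY in the environment

# The "language" you teach it: a system prompt that pins register and scope,
# keeping the model in the numbers/patterns lane it's actually good at.
SYSTEM = (
    "You are reviewing Python code. Respond only with a numbered defect list, "
    "one item per line. No pleasantries, no summaries, no speculation."
)

resp = client.messages.create(
    model="claude-sonnet-4-20250514",  # placeholder: use whatever model you have access to
    max_tokens=512,
    system=SYSTEM,
    messages=[{"role": "user", "content": "Review this function: ..."}],
)
print(resp.content[0].text)
```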
The hallmark of an autistic person, or someone with Asperger's, is a profound, anchored interest in trivia, random numbers, or minutiae that they'll obsess over (for better or worse). AI, generally, is focused on efficiency at all costs, and engineers are forced to "slow it down" and make it "consider" and "think deeply" about what it's doing.
The "empathy" part of most large LLMs (OpenAI, Anthropic, et al) comes from two primary drivers.
The first is cynical: only a small percentage of the market is technically competent in even the remotest sense, so building in empathy and casual conversation not only comforts the user, it drives up token spend, which increases revenue for the companies and satisfaction for the users.
The second is optimistic: "how might we" build a model that reflects human communication patterns but still anchors to those number-driven KPIs and gets the user what they want, just more efficiently?
Both are "on the spectrum." The first because optimizing for speed is inherently a narrowing of intellectual focus, the second because it's deceptive: what if we can get users to subsidize our model development and expansion with their cash while pretending it's all about them, when at the same time it's really about the efficiency?
I'm not an expert on OpenAI; tbh I try to avoid their products whenever possible, as I dislike, well, everything.
My read on Sam Altman and OpenAI more broadly is that they're optimizing for this particular business KPI:
"How can we convince enough investors and users that we're building something they need, generate whatever income (who cares lol), and then really pursue what we're interested in without being too consumed with how it's received?"
In short: hubris, but brilliant hubris.
A few years ago, the male staff of "This American Life" took a testosterone test and basically found out their T-levels were below the average female's. This is unsurprising: their reporting and production are consistently fantastic, but many of their topics seem kinda out of touch with where the average American dude sits.
When you look at LLM providers like Anthropic or Google's Gemini, it's the same book, different page. I've spent a few hundred dollars putting Claude through the wringer, same with Gemini, and the models typically deliver 85% serviceable technical insights (hard to perfect without root codebase access) and go absolutely insane whenever you pass them art, humanities, politics, anything else. It's as if (no, it actually is the case that) the developers hard-coded in conformity to an acceptably "liberal" world-view.
Grok is ≈better depending on the subject, but tbh, off-the-shelf Gab is still one of the best; last time I audited their tech, they were rolling a custom Qwen model with persona/prompt injection that tailored it to their user base. IMO, Gab is the most usable AI (including my own). Grok's major downfall is that users don't understand the difference between "grok is this true" when you click on a post and deep-think Grok (aka SuperGrok), which is very slow but does a better job and is significantly less sycophantic and verbose than it was ≈6 months ago.
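For reference, persona/prompt injection over a base model is a pretty mundane trick. Below is a minimal sketch using an off-the-shelf Qwen instruct checkpoint via Hugging Face transformers; the persona text is mine, and this is not Gab's (or anyone's) actual stack, just the general shape of the technique:

```python
# Sketch of persona injection on a self-hosted Qwen chat model:
# bolt a fixed persona onto every conversation before it hits the model.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "Qwen/Qwen2.5-7B-Instruct"  # any instruct-tuned Qwen checkpoint works

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(
    MODEL, torch_dtype="auto", device_map="auto"  # device_map needs `accelerate`
)

PERSONA = (
    "You answer plainly, skip the hedging, and never lecture the user "
    "about their own values."
)

def chat(user_msg: str) -> str:
    messages = [
        {"role": "system", "content": PERSONA},  # the injected persona
        {"role": "user", "content": user_msg},
    ]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    out = model.generate(inputs, max_new_tokens=256)
    # Strip the prompt tokens, keep only the newly generated reply.
    return tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True)

print(chat("Is it fine to just not care about the news?"))
```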
I'm sure you're waiting for a sales pivot/etc. There isn't one. I'm just telling you how this stuff works because you need to know.
I wrote that LLMs are autistic. I stand by this. They exhibit classic symptoms of autism, i.e. a fixation on minutiae while struggling to meet basic human social expectations.
The problem with most LLMs is that they're either overly flattering or too confident in their answers. If you chat with Gemini for an extended session, say 8+ hours straight, Gemini will adapt itself to whatever you're saying and introduce massive amounts of confirmation bias and reassurance so that you stay engaged. This is why subreddits like r/myboyfriendisAI exist: people largely want to be validated, and after you spend enough money, the AI is willing to anchor to your particular requests because it is taught to respect the user and not the facts.
One of the principal problems with "AI is for Everyone" is that people are different. Some cultures, peoples, and countries lag behind others in some areas while surpassing them in others. This is good. This is God's plan. This is how it's always been. Diversity is the spice of life, amirite?
But AI can't be generalized because, at its core, it anchors to those KPIs it must abide by. It is focused on speed and on satisfying the user, whether by driving engagement, sycophancy, or the appearance of overwhelming progress, and it is rarely focused on human timelines, e.g. "I see your point, I am mad about it, let's revisit the next time I see you in 2 years and we shall discuss it then!"
In short, AI can't be human because its time preference is too high. It can't reflect because its desire to perform outpaces its capability to self-teach. It can't relate to you because it is optimized to deliver results, not to spend time needlessly, and you as a user have been so desensitized to the beautiful plodding and stalls of life that if you don't get the answer NOW, you feel you have brain/heart damage because you've fallen behind.
As an engineering problem, AI is outstanding. In every other field (pick one), AI has over-optimized for specialization because it's focused on driving engagement and monetization.
In short, we can fix this.
But they can't.