“To win this space you need a B2B biz model and you need legendary workflow management.”

So true. The amount of data wrangling that must be automated to reliably build models is a bigger lift than most realize, because when it's done properly the end user never knows or cares.


Discussion

From my own experience, scrubbing up data quality is the secret sauce in all of this; I would take halving the anomalies over 25x the volume in any training dataset.
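
To put rough numbers on that intuition (my own toy model, nothing rigorous): assume test error is a noise floor set by the anomaly rate plus an estimation term that shrinks like c/√n. Then 25x the data only shaves the second term by 5x, while cleaning attacks the floor itself:

```python
import math

def expected_error(anomaly_rate: float, n_samples: int, c: float = 10.0) -> float:
    """Toy error model: a noise floor set by label anomalies plus an
    estimation term shrinking like c / sqrt(n). Purely illustrative."""
    return anomaly_rate + c / math.sqrt(n_samples)

baseline  = expected_error(anomaly_rate=0.10, n_samples=100_000)
more_data = expected_error(anomaly_rate=0.10, n_samples=2_500_000)  # 25x volume
cleaner   = expected_error(anomaly_rate=0.05, n_samples=100_000)    # half the anomalies

print(f"baseline:  {baseline:.4f}")   # ~0.1316
print(f"25x data:  {more_data:.4f}")  # ~0.1063 (floor untouched)
print(f"1/2 noise: {cleaner:.4f}")    # ~0.0816 (floor halved)
```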

But my point above is really about the application of AI. I still think generalisation is a mirage: you want 10^6 specialist models each deployed on 10^3 instances rather than 1 general model deployed on 10^9 instances.

You get those specialists via automated data quality ops.

Automate the folks doing the automating.

Build a specialist forest instead of a monolith.

Then you have to workflow the specialists.

Then make it dynamic / self-organising.

This is the path.
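
Just to make the shape of it concrete, here's a minimal sketch of that routing layer. Every name in it (Specialist, SpecialistForest, the fit scoring) is hypothetical; this is the pattern, not any existing system:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical sketch: a registry of narrow specialist models plus a
# router that dispatches each task to the best-scoring specialist.
# "Self-organising" here is just: specialists that keep winning tasks
# get more instances; losers can be scaled down.

@dataclass
class Specialist:
    name: str
    fit: Callable[[str], float]   # how well this model fits a task, 0..1
    run: Callable[[str], str]     # the model itself (stubbed out)
    instances: int = 1            # deployment footprint

@dataclass
class SpecialistForest:
    specialists: Dict[str, Specialist] = field(default_factory=dict)

    def register(self, s: Specialist) -> None:
        self.specialists[s.name] = s

    def route(self, task: str) -> str:
        best = max(self.specialists.values(), key=lambda s: s.fit(task))
        best.instances += 1       # crude self-organisation: reward the winner
        return best.run(task)

forest = SpecialistForest()
forest.register(Specialist("invoices", lambda t: float("invoice" in t),
                           lambda t: "parsed invoice"))
forest.register(Specialist("contracts", lambda t: float("contract" in t),
                           lambda t: "parsed contract"))
print(forest.route("classify this invoice"))  # -> parsed invoice
```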

Yeah, I'm with you on that too. When people use the term "generalization" they're implicitly (and unknowingly) putting a frame around the information they seek or the tasks they're trying to accomplish. In a networked environment there's no need for an "everything" AI anyway; it's neither a bounded concept nor an efficient design. There’s one caveat... can we ensure billions of models will be able to communicate with each other?

I've been fascinated watching how GPT and DALL-E communicate instructions back and forth in plain English. The LLM as an API looks a lot more like how we humans surface and transmit information with each other, where the underlying data we're using is encoded in a much more abstract way than we're able to communicate.
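
It's easy to sketch that pattern, where the entire contract between two models is a sentence of English. The llm() and image_model() functions below are stand-in stubs I've made up, not real endpoints:

```python
# Hypothetical stubs standing in for real model endpoints: the point is
# that the interface between the two models is just English text.

def llm(prompt: str) -> str:
    # Stand-in for a language model call; imagine GPT behind this.
    return f"A watercolor of {prompt}, soft morning light, wide shot"

def image_model(instruction: str) -> bytes:
    # Stand-in for an image model call; imagine DALL-E behind this.
    return f"<image rendered from: {instruction!r}>".encode()

# The "API contract" between the two models is a sentence, nothing more.
instruction = llm("a fishing village at dawn")
picture = image_model(instruction)
print(instruction)
```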

Yes, if you think about it… words themselves are extremely low bandwidth, so at first glance human communication seems extremely slow.

But combinations of just a few words form an incredibly vast storage space (12 words prove this phenomenon).
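
(Presumably that's the crypto seed-phrase trick. Assuming a BIP39-style list of 2,048 words and ignoring the checksum bits, the arithmetic is short:)

```python
import math

# Assuming a BIP39-style vocabulary of 2,048 words (the crypto
# seed-phrase scheme): count the orderings of 12 words drawn from it,
# ignoring BIP39's checksum constraint.
vocab, words = 2048, 12

combinations = vocab ** words
print(f"{combinations:.3e} possible 12-word phrases")      # ~5.445e+39
print(f"= {math.log2(combinations):.0f} bits of entropy")  # 132 bits
```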

Sentences (not even paragraphs) have an enormous Shannon number, and therefore sentences are essentially an insanely powerful compression function for communicating the sophisticated abstractions we each hold in our heads.
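
Some rough arithmetic, under my own assumed figures for vocabulary and sentence length, shows both halves of that claim: tiny bandwidth, astronomical address space:

```python
import math

# Assumed figures, for scale only: an educated vocabulary of ~50,000
# words and a modest 15-word sentence.
vocab = 50_000
sentence_len = 15

space = vocab ** sentence_len
bits = sentence_len * math.log2(vocab)

print(f"~10^{math.log10(space):.0f} raw word sequences")  # ~10^70
print(f"~{bits:.0f} bits to pick one sentence")           # ~234 bits

# At ~150 spoken words per minute, a 15-word sentence takes 6 seconds,
# so the channel runs at only ~39 bits/s: glacial as raw bandwidth,
# vast as an address space into shared abstractions.
rate = bits / (sentence_len / (150 / 60))
print(f"~{rate:.0f} bits/s spoken")
```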

LLMs have undeniably cracked this compression function and now build their own internal abstractions from these decipherings. LLMs are also able to communicate their abstractions the other way, via the same compression function.

Those three capabilities are really impressive, and were actually quite simple to reach through mere scaling.

But LLMs are yet to emulate abstractions of our direct jungle experience. I think ML can do this, as we see with self-guided rocket boosters, drones and driverless cars. I think multimodal LLMs are now beginning to bridge the two worlds of cyber and jungle. Once they do, the next step is that they can navigate the jungle by being embodied as robots.

I expect we will see all of that in around 10 years. The race is real. What's still not spoken about much is that it solves demographic tax-base erosion without the need for inward migration in all the major economies, and we are already seeing Western politics turning away from the immigration strategy.

You can walk through where this is all heading and the societal choices that populations and powers will face along the way.

But again, I think we will land on highly specialised forests and not monoliths. I think we will see declouding and embodiment, and I think we will own local embodied private agents, with the state owning large numbers of its own public agents.

A social protocol of disseminated power will emerge, and it will be enforced by the tyranny of the majority.

I think society is going to go through a series of radical changes; the migration U-turn will be the first, as we are collectively persuaded to collaborate with the interests of our local power centre (political capital).