Replying to Avatar Guy Swann

So what’s all the hoopla about DeepSeek, and why is it breaking everybody’s brain in AI right now?

I’ve been doing a deep dive for a couple of days, and these are the main deets I’ve pulled together. I’ll have a Guy’s Take on it soon, so stay tuned to the nostr:npub1hw4zdmnygyvyypgztfxn8aqqmenxtwdf3tuwrd44stjjeckpc37q6zlg0q feed

DeepSeek ELI5:

• The US has been hailed as the leader in AI while pushing fears that we need to stay closed and not share with China, because the evil CCP supposedly can’t figure it out without us

• ChatGPT and “Open”AI are the poster children, eating up absurd amounts of capital for training and inference (running) LLMs. Estimates say around $100 million or more for the o1 model.

• In just a couple of weeks, China dropped numerous open-source models with incredible results: Hunyuan for video, MiniMax, and now DeepSeek. All open source, all insanely competitive with the premier closed-source models in the US.

• DeepSeek (the R1 model) actually surpassed o1 on most benchmarks, particularly math, logic, and coding.

• DeepSeek is also totally open about its thought process: it explains and shows its reasoning as it runs, while OpenAI keeps o1’s chain of thought proprietary. That makes building with, troubleshooting, and understanding DeepSeek much easier.

• DeepSeek is also multimodal: you can give it PDFs and images, connect it to the internet, etc. It’s practically a full personal assistant with just a few tools plugged into it.

• The API costs 95% LESS than the ChatGPT API per call. They claim that price is profitable as well, while OpenAI is bleeding money.
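For a rough sense of the gap, here’s a back-of-the-envelope cost comparison. The per-million-token prices below are my assumption (approximate list prices from around R1’s launch), not figures from this thread:

```python
def call_cost(input_tokens: int, output_tokens: int,
              in_price_per_m: float, out_price_per_m: float) -> float:
    """Cost in USD of one API call, given per-million-token prices."""
    return (input_tokens / 1e6) * in_price_per_m \
         + (output_tokens / 1e6) * out_price_per_m

# Assumed list prices (USD per 1M tokens), for a call with
# 2,000 input tokens and 1,000 output tokens:
deepseek = call_cost(2_000, 1_000, in_price_per_m=0.14, out_price_per_m=0.28)
o1 = call_cost(2_000, 1_000, in_price_per_m=15.00, out_price_per_m=60.00)

savings = 1 - deepseek / o1
print(f"DeepSeek: ${deepseek:.5f}  o1: ${o1:.5f}  savings: {savings:.1%}")
```

On these assumed prices the per-call savings actually come out well above 95%.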

• They state that DeepSeek cost only ~$5.6 million to train (a figure covering the final training run, not total R&D).

• Export controls on GPUs and chips went into effect in the past year or two to prevent China from “catching up,” and they seem to have failed miserably: China apparently achieved ~20x the results per dollar on inferior hardware.

• The US model of AI, its costs, its capex structure, and the massive demand for chips has been the template for assessing the valuation, pricing, and future demand of the entire AI industry. DeepSeek just took a giant dump on all of it by outperforming while spending a tiny fraction to do it, all while dealing with restricted access to the newest chips.

All of this together is why people are freaking out: a plummet in Nvidia’s price, a reevaluation of OpenAI, and doubts about the US staying dominant, or even about the legitimacy of staying proprietary, since it may cause us to fall behind rather than lead. And all of it right after a $700 billion investment was announced that now just kinda looks like incompetent corporations wasting horrendous amounts of money on something they won’t even share with people, that you can’t run locally, and that was just surpassed by a few lean Chinese startups with barely a few million.

Almost 100% agree, but there is no way they did this without Nvidia chips.

They've open-sourced a groundbreaking LLM scaling paradigm (RL on CoT, reinforcement learning on chain-of-thought), which is no small thing, believe me, but our closed-source reasoning models are likely doing something similar (we just can't see it).

This newly open scaling paradigm is a game changer, but you still don't get this performance without massive compute.
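For anyone wondering what “RL on CoT” means in practice: the model generates a chain of thought, and only the final answer is scored. The reasoning steps get no direct supervision; they’re shaped purely by outcome reward. DeepSeek’s actual method (GRPO over sampled reasoning traces on a full LLM) is far more involved; this is just a toy REINFORCE sketch of the core idea, with the two “reasoning strategies” and their success rates entirely made up:

```python
import math
import random

random.seed(0)

# Toy "RL on chain-of-thought": the policy picks a reasoning strategy,
# a reward arrives ONLY for a correct final answer, and the policy is
# updated with REINFORCE. Strategy success rates are invented for the demo.
P_CORRECT = {"A": 0.9, "B": 0.3}  # strategy A happens to reason better


def p_a(logit: float) -> float:
    """Probability of choosing strategy A (sigmoid of the policy logit)."""
    return 1.0 / (1.0 + math.exp(-logit))


logit = 0.0  # start indifferent between A and B
lr = 0.5

for _ in range(500):
    prob_a = p_a(logit)
    strategy = "A" if random.random() < prob_a else "B"
    # Outcome-only reward: 1 if the final answer is correct, else 0.
    reward = 1.0 if random.random() < P_CORRECT[strategy] else 0.0
    # REINFORCE: gradient of log pi(strategy) w.r.t. the logit.
    grad = (1.0 - prob_a) if strategy == "A" else -prob_a
    logit += lr * reward * grad

# The learned preference for the better strategy, from outcome reward alone.
print(f"P(strategy A) after training: {p_a(logit):.3f}")
```

The point of the sketch: nothing ever tells the policy *why* strategy A is better, yet the preference for it emerges from answer-level reward alone, which is the same shape of signal the open paradigm applies to reasoning chains.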

They have illegal H100s; I'm nearly certain of it. Nvidia was probably due for a correction anyway. But it'll be funny to see what happens when we all find out they did this with Nvidia chips.


Discussion

To expand on that a bit: they are using secret H100s, therefore their capex claims are complete BS, therefore their API price is complete BS. The CCP smuggled in our chips and is bankrolling a loss to shake the market. Pretty freaking smart, tbh.

They claimed to have stockpiled them before the ban, not that they didn’t use them at all. They just used fewer of them.

For what it’s worth.

I would bet they've continued stockpiling them.

Indeed.

That’s what they claim, FYI.

This honestly wouldn’t surprise me in the least

Sorry, I didn’t mean to say they don’t have Nvidia chips, more that they’re likely paying a higher price and it’s somewhat harder to get ahold of the same amount of compute. Or at least that was the goal of the government’s actions.

So either:

• the controls did nothing and they have easy access, or

• access is slightly more difficult, but it didn’t matter anyway.

My bullet point was kinda vague and could be read the way you took it, but that’s not exactly what I meant.

Just posted this. A great breakdown of why and how. I’m curious if his take is technically accurate.

nostr:note1rz3tnf7qcrseqyunefq43pm8hcx9myl20hwu2gas7tdjwew29kls2zz7ln

Do you have a good write-up on that scaling paradigm? I read about it in another post but couldn’t confirm it yet and wasn’t sure what it meant.

Any explanation or breakdown link would be appreciatively zapped