“This is a classic disruption story: Incumbents optimize existing processes, while disruptors rethink the fundamental approach. DeepSeek asked ‘what if we just did this smarter instead of throwing more hardware at it?’”

https://threadreaderapp.com/thread/1883686162709295541.html

Discussion

I keep thinking there might come a point when the efficiency gains that let the smaller models keep up with the big ones run out (as the big players adopt them too); then the only way to match capability will be to have nearly the same 2-trillion-or-whatever parameter count. Hope open source keeps up! 🙏

Yeah, I wonder what the coming days will look like now that the cat is out of the bag. The big players won’t sit idly by, for sure.

Of course, consumer hardware is advancing too. Supposedly, with a couple of Nvidia’s upcoming DIGITS computers connected together, you’ll be able to run a 400B-parameter model locally.
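For what it’s worth, that 400B figure roughly checks out on a napkin. A minimal sketch, assuming ~128 GB of unified memory per machine and 4-bit quantized weights (both are my assumptions, not confirmed specs):

```python
# Back-of-envelope check of the "two DIGITS boxes run a 400B model" claim.
# Assumed: ~128 GB unified memory per machine, 4-bit quantized weights.

params = 400e9               # 400B model parameters
bytes_per_param = 0.5        # 4-bit quantization = half a byte per weight

weights_gb = params * bytes_per_param / 1e9   # about 200 GB of weights
total_mem_gb = 2 * 128                        # two linked machines

print(f"weights: {weights_gb:.0f} GB / available: {total_mem_gb} GB")
# weights: 200 GB / available: 256 GB
# The remaining ~56 GB has to cover the KV cache and activations, which is
# tight but plausible, hence the "connected together" caveat.
```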

I’m curious. Essentially, it sounds like DeepSeek just figured out how to be efficient: how to do more (or the same) with less. What if you combine this efficiency WITH the massive amounts of hardware? Does that get you anywhere? I.e., does efficiency plus “lots of hardware” get you any breakthroughs?

That’s what the big players like (Closed)OpenAI are going to bet on: adopt these efficiencies as they pop up and keep scaling, hoping their capabilities can only be matched at similar scale once the efficiency improvements run out.

Been saying this for 2 fucking years. Wall Street normies are so goddamn ignorant.

I have an AI running locally on my laptop that’s as good as ChatGPT running in a data center.

Add logic into the mix and it's unstoppable.

I'm curious which LLM was used in making that thread, because it seems 3× as long as it needs to be, and very repetitive. You can just smell the LLM in the text.