Is DeepSeek an Extinction Event?; A New Reasoning-Focused AI Lab Lands Funding
If you spent any time yesterday or this weekend on X, you might have concluded that AI developers, venture capitalists, Nvidia and cloud providers are totally screwed, thanks to new uber-cheap and impressive AI models from High-Flyer Capital Management, a Chinese hedge fund.
The truth is a lot more nuanced, though, and plenty of questions remain unanswered. The TL;DR is that the success of the DeepSeek models owes more to engineering prowess than to research breakthroughs. The models are still fairly difficult to run because of their size (and businesses aren’t using the application programming interface that High-Flyer provides, due to security concerns). The models’ advances raise real questions about the enormous amounts of capital chipmakers and cloud providers are spending on AI infrastructure, although it’s difficult to independently verify that High-Flyer really did spend so little to develop its DeepSeek models.
The first big question is how much of a technological leap forward the DeepSeek models really are. Researchers tell us that the DeepSeek papers feature impressive research in areas like “multi-token predictions” (essentially, allowing AI models to predict multiple future tokens, meaning words or parts of words, at the same time) and reinforcement learning (giving feedback to a model on its reasoning processes without needing expensive human input).
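To make the multi-token idea concrete, here is a minimal Python sketch of how training targets might be laid out when a model is asked to predict more than one future token per position. It is a toy illustration under our own assumptions, not DeepSeek's code: the function name `multi_token_targets` and the `horizon` parameter are hypothetical, and real models operate on numeric token IDs rather than strings.

```python
def multi_token_targets(tokens, horizon):
    """Pair each context with the next `horizon` tokens it should predict.

    Hypothetical helper for illustration only; `horizon=1` corresponds to
    ordinary next-token prediction.
    """
    examples = []
    # Stop early enough that every context has a full set of future tokens.
    for i in range(len(tokens) - horizon):
        context = tokens[: i + 1]                  # everything seen so far
        targets = tokens[i + 1 : i + 1 + horizon]  # the next `horizon` tokens
        examples.append((context, targets))
    return examples


tokens = ["Deep", "Seek", " models", " are", " cheap"]

# Standard training: one future token per position.
next_token = multi_token_targets(tokens, horizon=1)

# Multi-token training: two future tokens per position.
two_tokens = multi_token_targets(tokens, horizon=2)

for context, targets in two_tokens:
    print(context, "->", targets)
```

With `horizon=2`, each context is paired with two upcoming tokens instead of one, so each position in the training text supplies more prediction targets.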