News analysis

Four big reasons to worry about DeepSeek (and four reasons to calm down)

Platformer · Casey Newton

I. DeepSeek rising

Today, let’s talk about DeepSeek.

On Monday, the Nasdaq fell 3.1 percent as investors considered what the Chinese company’s high-performing, cheap-to-train, App Store-topping new models meant for the future of artificial intelligence and the tech industry overall. The leading maker of specialized AI chips, Nvidia, suffered most of all — its stock price dropped 17 percent, erasing $600 billion in market value.

As news of DeepSeek’s achievement spread over the weekend, it became a kind of Rorschach test. While everyone is impressed that DeepSeek built the best open-weights model available for a fraction of what its rivals spent, opinions about its long-term significance are all over the map.

To many prominent voices in AI, DeepSeek seems to have confirmed what they already believed. To AI skeptics, who believe that AI costs are so high that they will never be recouped, DeepSeek’s success is evidence of Silicon Valley waste and hubris. To AI bulls, who think America needs to build artificial general intelligence before anyone else as a matter of national security, DeepSeek is a dire warning to move faster. And to AI safety researchers, who have long feared that framing AI as a race would increase the risk of out-of-control AI systems doing catastrophic harm, DeepSeek is the nightmare that they have been waiting for.

Whatever the truth is, it won’t be known for some time. Reading the coverage over the past few days, and talking with folks who work in the industry, I’m convinced that DeepSeek is a huge story deserving of our ongoing attention. At the same time, I’m not sure that the emergence of a powerful, low-cost Chinese AI model changes the dynamics of competition quite as much as some observers are saying.

With that in mind, here are some reasons to worry about DeepSeek — and some reasons to calm down.

II. Reasons to worry

No one really knows what DeepSeek’s long-term game is. As you may know by now, DeepSeek was created by a 10-year-old Chinese quantitative hedge fund named High-Flyer; Liang Wenfeng, DeepSeek’s CEO, is also a cofounder of the fund. High-Flyer developed AI algorithms for use in trading; in 2023, it started a lab to build AI tools unrelated to its core business. 

Over the next year or so, it made a series of technical innovations in building large language models. Its stated mission, as posted on its X profile, is to “Unravel the mystery of AGI with curiosity.” The company has committed to open-sourcing its models, and has offered them to developers at very low prices.

For the moment, DeepSeek doesn’t seem to have a business model to match its ambitions. For most big US AI labs, the (as yet unrealized) business model is to develop the best service and sell it at a profit. To date, DeepSeek has positioned itself as a kind of altruistic giveaway.

That could change at any time. DeepSeek could introduce subscriptions, or place new restrictions on its developer APIs. Zvi Mowshowitz theorizes that the company could take user data and give it to the hedge fund for trading insights.

And at some point, the Chinese government will have something to say about one of its companies trying to give away powerful AI to anyone who wants it, including China’s adversaries. 

In the meantime, though, we can only guess what DeepSeek’s ambitions are. And that worries me, because in some very real sense we don’t know what we’re dealing with here. 

The big AI labs don’t seem to have much of a moat. Somewhat lost in the DeepSeek conversation so far is that the company’s impressive v3 and r1 models were built on top of American innovations. It was the US AI labs that developed the underlying architecture for large language models and the newer reasoning models; what DeepSeek has done is cleverly optimize that architecture using older hardware and less computing power.

In the old days, by which I mean the time of GPT-3, it took OpenAI’s rivals months or longer to reverse-engineer its process and absorb its innovations. It could take a year for those techniques to filter down to the open-source models that are made available for free.

But DeepSeek shows that the open-source labs have gotten much better at reverse-engineering — and that any lead the US AI labs establish can be quickly erased. This is a problem if your main business is selling models to developers: switching costs are low, and the cost savings they can achieve by using DeepSeek are huge.

For the AI labs, this is a business problem. But it could be a geopolitics problem, too: DeepSeek’s innovation shows that there will be no keeping AI out of anyone’s hands, for better and for worse. As Anthropic co-founder Jack Clark put it today: “DeepSeek means AI proliferation is guaranteed.”

DeepSeek’s success will lead more people to see AI as an arms race against China.

For some venture capitalists in particular, it has long been a goal to frame AI progress as a contest against China. This idea was central to “Situational Awareness,” Leopold Aschenbrenner’s viral essay from last year about AI progress, whose publication coincided with the former AI researcher announcing that he had started a new venture capital firm.

VCs love this framing for lots of reasons. It builds on a rational fear: that an authoritarian government will create superhuman intelligence before democracies do and use it against them. But it is also meant to serve as ammunition against regulations, which would slow both AI progress and returns to VCs’ portfolios; and to drum up interest in military tech, which generates further returns for those same portfolios.

In general I think we should be against the world’s richest people cheerleading us into armed conflict.

The more that people believe AI is an existential race against China, the less safely it will be built.   

Say what you will about the failures of US AI labs — and there are many — but they have at least tried to outline methods for building powerful AI safely. DeepSeek, by contrast, has not said one word about AI safety. If it has a lone AI safety researcher, that’s news to me. 

To accelerationists, this could be a reason for US companies to abandon their safety efforts — or at least reduce future investments in them.

It’s important to remember that all of the most important AI safety problems remain unsolved. Should one of the corporate AI labs suddenly invent and release a superhuman intelligence, there would be no way to ensure it was aligned with human values or desires, and no plan for what to do next. The Biden administration placed some gentle restrictions on US AI labs with an executive order, but Trump repealed it on day one.

As Mowshowitz writes: “These people really think that the best thing humanity can do is create things smarter than ourselves with as many capabilities as possible, make them freely available to whoever wants one, and see what happens, and assume that this will obviously end well and anyone who opposes this plan is a dastardly villain.”  

III. Reasons for calm

Everyone basically already assumed that all of this was going to happen. By “all of this,” I mean that (1) open-source companies would reverse-engineer everything the big labs are doing and (2) that costs for AI training and inference would decline dramatically over time. 

Ethan Mollick, a professor at the Wharton School of the University of Pennsylvania, noted over the weekend that “costs for GPT-4 level intelligence dropped by 1000x in the last 18 months.” For that reason, he writes, “A 95% price drop in reasoning models seems not to be something that will break the labs.”

Funnily enough, Google released its own super-cheap reasoning model just five days ago, built on similar techniques, and no one cared at all. (It’s not open source or free or Chinese. But still!)

Anyone who has sent the same query to ChatGPT, Claude, and Gemini on the same day has known for more than a year that you can get basically as good an answer from any of them. And anyone who has used Llama has known for more than a year that the open-weights version that arrives later is only slightly worse.

Right now a lot of investors are catching up to these basic facts at the same time, and stock prices are falling accordingly. But it’s not clear to me that much of it was really news to the AI labs and tech platforms.  

Tech companies can still find good use for all those AI capital expenditures.

American tech companies plan to spend tens of billions of dollars building data centers to serve their AI needs this year. One question many folks were asking today was whether DeepSeek would make all of those investments moot. If you can build a best-in-class model with older hardware, what’s the point?

The point is to (1) train more powerful models and explore techniques that open-source developers haven’t ripped off yet; and (2) serve the demand that those more powerful and capable models generate. While much of the AI discussion over the past six months has revolved around the bottleneck that a lack of chips has created for training new models, the real constraint is that no one has as much computing power as they want.

For the most part, the same servers and chips that power model training can be used for inference. DeepSeek’s innovation means that the day you can run a state-of-the-art model on your laptop is much closer. But we’re not there yet.

The export controls are helping.

Some observers have said that DeepSeek’s progress shows that the Biden administration’s restrictions on chip exports have failed. I don’t think that’s right. As Jordan Schneider writes at ChinaTalk, these export controls are relatively new — and need more time to really have an effect.

A primary effect of the export controls is that China should have less computing power than the United States overall for some time. That means that even as Chinese companies like DeepSeek release more powerful models, China may not be able to deploy them as widely as it would like to.

That same computing power is also essential for inventing more powerful AI systems. As Miles Brundage, a former policy researcher at OpenAI, put it recently on a podcast with Schneider: “There are all sorts of ways of turning compute into better performance, and American companies are currently in a better position to do that because of their greater volume and quantity of chips.”

American AI labs are still in the lead.

DeepSeek’s innovations are real, and they go a long way toward making the AI systems we have today cheaper and more accessible. But to the extent that they represent conceptual breakthroughs, it’s only in optimizing technology that OpenAI and others invented first. There can be a ton of value in optimization, of course, and I’m sure every American company wishes it had come up with these techniques first. (Meta’s teams are now reportedly in overdrive trying to reverse-engineer them.)

At the same time, it was only last week that OpenAI made available to Pro plan users a computer that can use itself. The whole US AI industry has shifted its focus to building AI agents and full-fledged virtual coworkers. To date, very little of that is visible. (And as I wrote last week, OpenAI’s Operator can be unbelievably frustrating to use.) Perhaps DeepSeek or another Chinese company could beat the United States to the punch with agents. But it seems likelier that they will simply wait for an American company to release a good one and try to copy it. 

I can understand why some people look at the progress DeepSeek has made and assume that it is about to overtake every US lab. If the company is as careless about safety as it appears to be, there may someday be an actual reason to panic. For the moment, though, I think everyone would benefit from taking a few long deep breaths. 

Those good posts

For more good posts every day, follow Casey’s Instagram stories.

Talk to us

Send us tips, comments, questions, and DeepSeek queries: casey@platformer.news. Read our ethics policy here.