Cade Metz
The New York Times

What to Know About DeepSeek and How It Is Upending A.I.

Tech stocks tumbled. Giant companies like Meta and Nvidia faced a barrage of questions about their future. Tech executives took to social media to proclaim their fears.

And it was all because of a little-known Chinese artificial intelligence start-up called DeepSeek.

DeepSeek made waves around the world on Monday. One of its accomplishments, building a very powerful A.I. model with far less money than many A.I. experts thought possible, raised a host of questions, including whether US companies were even competitive in A.I. anymore.

DeepSeek is “AI’s Sputnik moment,” Marc Andreessen, a tech venture capitalist, posted on social media on Sunday.

How could a company that few people had heard of have such an effect?

What is DeepSeek?

DeepSeek is a start-up founded and owned by the Chinese stock trading firm High-Flyer. Its goal is to build A.I. technologies along the lines of OpenAI’s ChatGPT chatbot or Google’s Gemini. By 2021, DeepSeek had acquired thousands of computer chips from the US chipmaker Nvidia. Such chips are a fundamental part of any effort to create powerful A.I. systems.

Over the past few years, DeepSeek has released several large language models, the kind of technology that underpins chatbots like ChatGPT and Gemini. On Jan. 10, it released its first free chatbot app, which was based on a new model called DeepSeek-V3.

Why did the stock market react to it now?

When DeepSeek introduced its DeepSeek-V3 model the day after Christmas, it matched the abilities of the best chatbots from US companies like OpenAI and Google. That alone would have been impressive.

But the team behind the new system also revealed a bigger step forward. In a research paper explaining how it built the technology, DeepSeek said it used only a fraction of the computer chips that leading A.I. companies relied on to train their systems.

The world’s top companies typically train their chatbots with supercomputers that use as many as 16,000 chips or more. DeepSeek’s engineers said they needed only about 2,000 Nvidia chips.

How did DeepSeek make its tech with fewer A.I. chips?

Top A.I. engineers in the United States say that DeepSeek’s research paper laid out clever and impressive ways of building A.I. technology with fewer chips.

In short, the start-up’s engineers demonstrated a more efficient way of analyzing data using the chips. Leading A.I. systems learn their skills by pinpointing patterns in huge amounts of data, including text, images and sounds. DeepSeek described a way of spreading this data analysis across several specialized A.I. models (what researchers call a “mixture of experts” method) while minimizing the time lost by moving data from place to place.
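For readers curious how a “mixture of experts” works in general, here is a minimal Python sketch. It is a toy illustration of the broad idea, not DeepSeek’s actual design: the four “experts” are hypothetical one-line functions, the gate scores are made-up numbers, and the choice to use only the top two experts per input is an assumption for the example. The point it shows is that each input is handled by only a few experts rather than all of them, which is where the savings in computation come from.

```python
import math

def softmax(scores):
    """Turn raw gate scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route(gate_scores, top_k=2):
    """Pick the top_k highest-scoring experts and renormalize their weights."""
    probs = softmax(gate_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    return {i: probs[i] / norm for i in chosen}

def moe_forward(x, experts, gate_scores, top_k=2):
    """Run only the selected experts and blend their outputs by weight."""
    weights = route(gate_scores, top_k)
    return sum(w * experts[i](x) for i, w in weights.items())

# Hypothetical experts: in a real system each would be a neural network.
experts = [
    lambda x: x + 1,
    lambda x: x * 2,
    lambda x: x ** 2,
    lambda x: -x,
]

# Made-up gate scores for one input; a real gate would compute these from the input.
gate_scores = [0.1, 2.0, 2.0, -1.0]

output = moe_forward(3.0, experts, gate_scores, top_k=2)
print(output)
```

In this toy run, only two of the four experts are ever called for the input. Real systems apply the same routing idea at a far larger scale, and much of the engineering effort goes into keeping the movement of data between experts cheap.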

Is DeepSeek’s tech as good as systems from OpenAI and Google?

DeepSeek-V3 can answer questions, solve logic problems and write its own computer programs as effectively as anything already on the market, according to standard benchmark tests.

Just before DeepSeek released its technology, OpenAI had unveiled a new system, called OpenAI o3, which seemed more powerful than DeepSeek-V3. But OpenAI has not released this system to the wider public.

Then on Jan. 20, DeepSeek released its own reasoning model called DeepSeek R1, and it, too, impressed the experts. That eventually sent US investors and others into a panic late last week and over the weekend as they realized the importance of DeepSeek’s new technology.

Large numbers of A.I. chips can still help companies in many ways. With more chips, they can run more experiments as they explore new ways of building A.I.

Hasn’t the United States limited the number of Nvidia chips sold to China?

Yes. To maintain the US lead in the global A.I. race, the Biden administration had put in place rules limiting the number of powerful chips that could be sold to China and other rivals.

But the impressive performance of the DeepSeek model raised questions about the unintended consequences of the American government’s trade restrictions. The controls have forced researchers in China to get creative with a wide range of tools that are freely available on the internet.

Some experts continue to argue in favor of US trade restrictions, saying that they were only recently put in place and that they will have a greater effect on China’s ability to create A.I. as the years pass.
