Anthropic accuses 3 Chinese companies of collecting its data

By Anand Kumar, Senior Journalist
Anand Kumar is a Senior Journalist at Global India Broadcast News, covering national affairs, education, and digital media. He focuses on fact-based reporting and in-depth analysis of current events.

Anthropic’s terms of service prohibit collecting its data for distillation. Separately, amid a dispute over sharing its technology with the Pentagon, the company’s CEO Dario Amodei (pictured inset) has outlined his ethical concerns about unsupervised government use of AI.

SAN FRANCISCO: San Francisco-based AI startup Anthropic has accused three Chinese companies of improperly collecting large amounts of data from its AI technologies in an attempt to speed up the development of their own systems. DeepSeek, Moonshot and MiniMax — three prominent Chinese startups — used about 24,000 fraudulent accounts to generate more than 16 million conversations with its Claude chatbot, material that could be used to train their own chatbots, Anthropic said in a blog post Monday.

Using data from one AI system to train another — a process called distillation — is common in AI work. But Anthropic’s terms of service prohibit anyone from surreptitiously collecting data for distillation, and do not allow its technology to be used in China.

Anthropic’s main competitor, OpenAI, has also accused Chinese companies of lifting large amounts of data from its chatbot, ChatGPT, to build similar offerings. In a memo sent to the House Select Committee on China last week, OpenAI said DeepSeek and other Chinese startups were using new and “opaque” distillation methods as part of their “continued efforts to free-ride” on technologies developed by OpenAI and other U.S. companies.

Like OpenAI, Anthropic said the practice poses a national security risk, adding that it could allow China to build artificial intelligence technologies to create bioweapons or tools for mass surveillance. The startup has guardrails on its technology designed to prevent such uses, but those guardrails can be stripped out during distillation.

Anthropic called on government officials and other AI companies to help prevent Chinese companies from extracting American models. “These campaigns are increasing in intensity and sophistication,” Anthropic said in its post. “The window to act is narrow, and the threat extends far beyond any single company or region. Addressing this threat will require rapid, coordinated action among industry players, policy makers, and the global AI community.”

DeepSeek, Moonshot and MiniMax did not immediately respond to requests for comment. Anthropic published its post amid a conflict with the Department of Defense over the Pentagon’s use of its technology.

The Pentagon has agreed to use Anthropic’s technologies for classified missions, but is threatening to cut ties with the startup because Anthropic does not want its technologies used in situations involving autonomous weapons or domestic surveillance.

Last year, DeepSeek spooked Silicon Valley tech companies and sent U.S. financial markets into chaos after releasing artificial intelligence technologies that matched the performance of the best systems on the market. Until then, the prevailing wisdom in Silicon Valley was that the most powerful systems could not be built without billions of dollars’ worth of specialized computer chips. But DeepSeek said it had created its technology using far fewer resources.

Like U.S. companies, DeepSeek, Moonshot and MiniMax build their AI technologies using computer code and data collected online. AI companies around the world rely heavily on a practice called open sourcing, meaning they freely share the code behind their technologies and reuse code shared by others.

They see this as a way to accelerate technological development. AI companies also need vast amounts of online data to train their systems; leading systems learn their skills by analyzing almost all the text on the internet. Distillation is often used to train new systems, and open-source licenses often permit it. But if a company takes data from proprietary technology, the practice can be legally problematic.

Anthropic, now valued at $380 billion, faces multiple lawsuits accusing it of illegally using copyrighted internet data to train its systems. In September, as part of a landmark legal settlement, Anthropic agreed to pay $1.5 billion to a group of authors and publishers after a judge ruled that it had illegally downloaded and stored millions of copyrighted books. It was the largest payout in the history of U.S. copyright cases.

OpenAI and other AI companies face similar lawsuits, including one filed by The New York Times against OpenAI and its partner Microsoft. That suit asserts that millions of articles published by The Times were used to train automated chatbots that now compete with the news organization as a source of reliable information. Both OpenAI and Microsoft deny the allegations.

This article originally appeared in The New York Times.
