Three AI laboratories have been found conducting large-scale illicit campaigns to extract capabilities from Anthropic’s Claude AI, the company revealed.
DeepSeek, Moonshot, and MiniMax used around 24,000 fraudulent accounts to generate more than 16 million interactions, violating terms of service and regional access restrictions. The technique, called distillation, trains a weaker model on outputs from a stronger one, speeding AI development.
Distilled models obtained in this manner often lack critical safeguards, creating serious national security concerns. Without protections, these capabilities could be integrated into military, intelligence, surveillance, or cyber operations, potentially by authoritarian governments.
The attacks also undermine export controls designed to preserve the competitive edge of US AI technology and could give a misleading impression of foreign labs’ independent AI progress.
Each lab followed coordinated playbooks using proxy networks and large-scale automated prompts to target specific capabilities such as agentic reasoning, coding, and tool use.
Anthropic attributed the campaigns using request metadata, infrastructure indicators, and corroborating observations from industry partners. The investigation detailed how distillation attacks operate from data generation to model launch.
In response, Anthropic has strengthened detection systems, implemented stricter access controls, shared intelligence with other labs and authorities, and introduced countermeasures to reduce the effectiveness of illicit distillation.
The company emphasises that addressing these attacks will require coordinated action across the AI industry, cloud providers, and policymakers to protect frontier AI capabilities.
Would you like to learn more about AI, tech and digital diplomacy? If so, ask our Diplo chatbot!
