Anthropic uncovers a major AI-led cyberattack
The investigation by Anthropic demonstrates how agentic AI now performs reconnaissance, exploit development, and data extraction, replacing human teams and lowering barriers for sophisticated cyber operations.
The US AI firm Anthropic has revealed details of what it describes as the first known cyber espionage operation largely executed by an autonomous AI system.
Suspicious activity detected in September 2025 led to an investigation that uncovered an attack framework that used Claude Code as an automated agent to target about thirty high-value organisations across technology, finance, chemicals and government.
The attackers relied on recent advances in model intelligence, agency and tool access.
By breaking tasks into small prompts and presenting Claude as a defensive security assistant instead of an offensive tool, they bypassed safeguards and pushed the model to analyse systems, identify weaknesses, write exploit code and harvest credentials.
The AI completed most of the work with human direction at only a few points, operating at a scale and speed that human hackers would struggle to match.
Anthropic responded by banning accounts, informing affected entities and working with authorities as evidence was gathered. The company argues that the case shows how easily sophisticated operations can now be carried out by less-resourced actors who use agentic AI instead of traditional human teams.
Errors such as hallucinated credentials remain a limitation, yet the attack marks a clear escalation in capability and ambition.
The firm maintains that the same model abilities exploited by the attackers are needed for cyber defence, and it sees greater automation in threat detection, vulnerability analysis and incident response as vital.
Safeguards, stronger monitoring and wider information sharing are presented as essential steps for an environment where adversaries are increasingly empowered by autonomous AI.
