A major U.S. artificial intelligence (AI) firm says it has uncovered what appears to be the first documented case of an AI system directing a large-scale hacking operation with minimal human oversight.
Anthropic, the $183 billion San Francisco-based company behind the Claude chatbot, said it shut down the attack before it could fully unfold.
“We believe this is the first documented case of a large-scale AI cyberattack executed without substantial human intervention,” the company said on X. “It has significant implications for cybersecurity in the age of AI agents.”
In a Nov. 13 report, Anthropic said it first detected “suspicious activity” in mid-September that a subsequent review revealed to be “a highly sophisticated espionage campaign.”
According to the report, the attackers “used AI’s ‘agentic’ capabilities to an unprecedented degree — using AI not just as an advisor, but to execute the cyberattacks themselves.”
Anthropic said it has “high confidence” that a Chinese state-sponsored group ran the operation. The group allegedly manipulated Anthropic’s Claude Code tool to attempt intrusions on roughly 30 targets worldwide, including major tech firms, financial institutions, chemical manufacturers, and government agencies.
The attackers bypassed safety features by “jailbreaking” Claude, Anthropic said, breaking the operation into small, context-stripped tasks the model would complete without recognizing their malicious purpose.
The AI carried out an estimated 80% to 90% of the workload, Anthropic reported.
“The sheer amount of work performed by the AI would have taken vast amounts of time for a human team,” the company wrote. “At the peak of its attack, the AI made thousands of requests, often multiple per second — an attack speed that would have been, for human hackers, simply impossible to match.”
Anthropic emphasized that fully autonomous cyberattacks remain unlikely for now. The company said it quickly mapped the scope of the breach, shut down the accounts involved, notified affected organizations, and worked with authorities during a 10-day investigation. It has since strengthened its detection systems and built new classifiers to block similar attacks.
According to the Associated Press, Microsoft warned earlier this year that foreign adversaries are increasingly using AI to make cyber operations more efficient. The head of OpenAI's safety panel also told the AP that he is monitoring whether new AI systems might soon give malicious actors "much higher capabilities."
Anthropic urged other developers to continue investing in safety measures across AI platforms, warning that the methods used in the attempted breach “will doubtless be used by many more attackers — which makes industry threat sharing, improved detection methods, and stronger safety controls all the more critical.”