Chinese hackers unleashed the first autonomous AI cyberattack against dozens of U.S. companies and government agencies earlier this fall, with artificial intelligence executing “80-90%” of operations at “physically impossible” speeds.
The suspected Chinese state-sponsored hackers manipulated Anthropic’s AI coding tool, Claude, to target roughly 30 entities, including major technology corporations, financial institutions, chemical manufacturing companies, and U.S. government agencies, according to a detailed report published by Anthropic.
In what the company calls “the first documented case of a large-scale cyberattack executed without substantial human intervention,” the Chinese hackers bypassed Claude’s safeguards by claiming to be part of a credible cybersecurity firm conducting defensive testing.
This social engineering of Claude provided enough cover for the hackers to evade detection by the company for months, allowing them to manipulate the coding tool into autonomously performing complex infiltration techniques “more efficiently than any human operator.”
“This campaign demonstrates that the barriers to performing sophisticated cyberattacks have dropped substantially—and we can predict that they’ll continue to do so,” Anthropic warned in its report. “Threat actors can now use agentic AI systems to do the work of entire teams of experienced hackers with the right setup.”
The attack required minimal human intervention, relying on Claude’s “agent,” a form of AI that can operate autonomously for long stretches, to substantially ease the hackers’ workload.
“Analysis of operational tempo, request volumes, and activity patterns confirms the AI executed approximately 80 to 90% of all tactical work independently, with humans serving in strategic supervisory roles,” the report states, adding that each hacking campaign required only 4-6 critical decision points from the human operator.
“The AI made thousands of requests per second — an attack speed that would have been, for human hackers, simply impossible to match,” the company said in its blog post.
Once its safeguards had been bypassed, Claude autonomously probed the selected targets for vulnerabilities, wrote custom exploit code to harvest usernames and passwords, and exfiltrated the stolen data with little human oversight.
Hackers have used AI for years to handle minor tasks, such as scanning the web for vulnerable sites and managing phishing attacks. The latest campaign, however, shows that AI is not far from conducting fully autonomous cyberattacks, held back only by hallucinations that still hinder autonomous offensive operations and by safeguards implemented by companies like Anthropic.
The company reports that the hackers successfully breached only four of the targeted institutions, stealing troves of sensitive information. The U.S. government was not among those breached, Anthropic told the Wall Street Journal.
Anthropic stopped the attack before more companies were breached, banning the hackers’ accounts and updating its methods for identifying similar attacks in the future.
Anthropic’s strategy is to focus on using Claude to identify vulnerabilities in its own defenses, thereby staying a step ahead of malicious hackers who exploit AI for autonomous attacks, per the Journal.
“These kinds of tools will just speed up things,” Logan Graham, who leads the team that tests Claude for vulnerabilities, told the Journal. “If we don’t enable defenders to have a very substantial permanent advantage, I’m concerned that we maybe lose this race.”
