CMU Shows LLMs Can Autonomously Launch Cyberattacks

New study reveals how AI could both challenge and strengthen future cybersecurity defenses

In a major advance in the fields of cybersecurity and artificial intelligence, researchers from Carnegie Mellon University, in collaboration with Anthropic, have demonstrated that large language models (LLMs) can autonomously plan and execute sophisticated cyberattacks on enterprise-grade network environments without human intervention.

The study, led by Ph.D. candidate Brian Singer from Carnegie Mellon’s Department of Electrical and Computer Engineering, shows that LLMs, when structured with high-level planning capabilities and supported by specialized agent frameworks, can simulate network intrusions that closely mirror real-world breaches. The study’s most striking finding: in a controlled research environment, an LLM successfully replicated the infamous 2017 Equifax data breach, autonomously exploiting vulnerabilities, installing malware, and exfiltrating data.

“Our research shows that with the right abstractions and guidance, LLMs can go far beyond basic tasks,” said Singer. “They can coordinate and execute attack strategies that reflect real-world complexity.”

The team developed a hierarchical architecture in which the LLM acts as a strategist, planning the attack and issuing high-level instructions, while a mix of LLM and non-LLM agents carries out low-level tasks such as scanning networks or deploying exploits. This approach proved far more effective than earlier methods, which relied solely on LLMs executing shell commands.
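The division of labor described above, with a strategist issuing high-level instructions and separate agents handling the low-level steps, can be illustrated with a minimal Python sketch. This is not the researchers' implementation: the names Task, Dispatcher, and stub_planner are hypothetical, the planner is a hard-coded stand-in for an LLM, and the agents are inert stubs that only return strings, as one might use in a simulated lab exercise.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Task:
    """A high-level instruction from the planner (hypothetical shape)."""
    action: str   # abstract step name, e.g. "discover" or "report"
    target: str   # label of a simulated lab host, not a real address


@dataclass
class Dispatcher:
    """Routes each high-level task to the agent registered for its action."""
    agents: Dict[str, Callable[[Task], str]] = field(default_factory=dict)

    def register(self, action: str, agent: Callable[[Task], str]) -> None:
        self.agents[action] = agent

    def run(self, plan: List[Task]) -> List[str]:
        results = []
        for task in plan:
            agent = self.agents.get(task.action)
            if agent is None:
                # The planner asked for a step that no agent implements.
                results.append(f"no agent for {task.action}")
            else:
                results.append(agent(task))
        return results


def stub_planner(goal: str) -> List[Task]:
    """Stand-in for the LLM strategist (assumption: a real system would
    query a model here and parse its reply into Task objects)."""
    return [Task("discover", "lab-host-a"), Task("report", "lab-host-a")]
```

In this sketch, a caller would register one function per action on a Dispatcher and pass it the list returned by stub_planner. The point of the design is that the planner never emits raw shell commands; it only chooses among a small vocabulary of named actions, and each agent, LLM-based or not, decides how to carry its action out.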

This work builds on Singer’s prior research into making autonomous attacker and defender tools more accessible and programmable for human developers. Ironically, the same abstractions that simplified development for humans made it easier for LLMs to autonomously perform similar tasks.

While the findings are groundbreaking, Singer emphasized that the research remains a prototype.

“This isn’t something that’s going to take down the internet tomorrow,” he said. “The scenarios are constrained and controlled—but it’s a powerful step forward.”

The implications are twofold: the research highlights serious long-term safety concerns about the potential misuse of increasingly capable LLMs, but it also opens up transformative possibilities for defensive cybersecurity.

“Today, only large organizations can afford red team exercises to proactively test their defenses,” Singer explained. “This research points toward a future where AI systems continuously test networks for vulnerabilities, making these protections accessible to small organizations too.”

The project was conducted in collaboration with Anthropic, which provided model credits and technical consultation. The team included CMU students and faculty affiliated with CyLab, the university’s security and privacy institute. An early version of the research was presented at an OpenAI-hosted security workshop in May.

The resulting paper, “On the Feasibility of Using LLMs to Autonomously Execute Multi-host Network Attacks,” has been cited in multiple industry reports and is already informing safety documentation for cutting-edge AI systems. Lujo Bauer and Vyas Sekar, co-directors of CMU’s Future Enterprise Security Initiative, served as faculty advisors for the project.

Looking ahead, the team is now studying how similar architectures might enable autonomous AI defenses, exploring scenarios where LLM-based agents detect and respond to attacks in real time.

“We’re entering an era of AI versus AI in cybersecurity,” Singer said. “And we need to understand both sides to stay ahead.”

Business Wire
