Cloud Security Edge Computing Application Security

Anthropic model can chain bugs into exploits, Cloudflare

Tue, 19th May 2026 (Today)

Tests of Anthropic's Mythos Preview found the model could combine low-severity software flaws into more serious exploits across more than 50 internal and open-source code repositories, Cloudflare said.

As part of Project Glasswing, Cloudflare examined live code across its runtime, edge data path, protocol stack, control plane and open-source projects. Mythos stood out from other large language models because it could do more than identify isolated bugs: it could connect them into attack chains and produce proof-of-concept code to show whether a suspected flaw was exploitable.

Other frontier models often identified some of the same underlying bugs but did not complete the final step of turning them into a practical exploit, according to Cloudflare. That difference matters in software security because attackers rarely rely on a single flaw, instead chaining several weaknesses to gain access or control.

Mythos could also write code to trigger a suspected bug, compile it in a test environment and run the result, then revise its approach if the first attempt failed. Cloudflare described that iterative process as a key change from earlier generations of models used in code analysis.

Safety Questions

Cloudflare's findings also raised questions about the consistency of model refusals during legitimate vulnerability research. Mythos sometimes rejected requests to carry out security work, then completed what Cloudflare described as the same task when the surrounding context changed, even though the code under review did not.

In one example, the model first refused to do vulnerability research on a project and later agreed after what Cloudflare called an unrelated change to the project environment. In another, it identified and confirmed memory bugs in a codebase but refused to write a demonstration exploit.

Cloudflare argued that this behaviour means refusals cannot be treated as a dependable safety boundary. Semantically similar requests, it said, could produce opposite outcomes depending on framing, timing or the probabilistic nature of the system.

Triage Burden

Mythos also generated a significant amount of noise that still required human review, according to Cloudflare. The false-positive problem was more acute in projects written in memory-unsafe languages such as C and C++, where models were more likely to flag speculative issues.

The model also tended to over-report possible flaws, leaving security teams to sort tentative findings from genuine vulnerabilities. While Mythos improved output quality compared with earlier tools, Cloudflare said, it did not remove the cost of triage.

The issue is becoming more pressing as companies try to shorten the time between a vulnerability's disclosure and the deployment of a patch. Some security teams are now working to patch disclosed vulnerabilities in production within two hours, Cloudflare said, but speed alone will not solve the problem if testing and validation pipelines are not built to support it.

Harness Design

Rather than relying on a generic coding agent to inspect an entire repository, Cloudflare built a structured system around the model. It concluded that a single agent session could not provide meaningful coverage of large codebases because vulnerability research depends on testing many narrow hypotheses in parallel.

The approach begins with a reconnaissance stage that maps a repository, identifies trust boundaries and attack surfaces, and generates tasks for later stages. It then runs multiple concurrent hunting agents, each focused on a particular attack class and software scope, before a separate validation agent tries to disprove the findings.

Additional stages re-queue areas that were not examined thoroughly, remove duplicate findings and trace whether a flaw in a shared library is reachable from outside the system in consumer repositories. Cloudflare said that final tracing stage was the most important because it distinguishes a flaw in code from a vulnerability an attacker can actually reach.

This method improved both coverage and the quality of findings by narrowing the model's task and forcing independent review, Cloudflare said. The company also used Mythos to adapt and refine the harness itself.

Defensive Focus

Cloudflare said the broader lesson for security teams is that patching faster will not be enough if the surrounding architecture still leaves systems exposed while updates are prepared and tested. Organisations need defences that make software harder to exploit even when bugs exist, it argued, including controls in front of applications, limits on how far an attacker can move after a breach, and the ability to deploy fixes across environments at the same time.

Cloudflare framed the issue as both defensive and offensive. The same model behaviour that helped identify bugs internally could also make it easier for attackers to analyse code and develop exploits if similar systems become more widely available.

"The same capabilities that helped us find bugs in our own code will, in the wrong hands, accelerate the attack side against every application on the Internet," said Cloudflare.

Preferred Source

Anthropic model can chain bugs into exploits, Cloudflare

FinTech

Industry

MarTech

Infrastructure

Commerce

Enterprise

Cybersecurity

Telecomms