“After a month, most partners have each found hundreds of critical or high-severity vulnerabilities”: Anthropic claims that Mythos has discovered more than ten thousand major security vulnerabilities in “the world’s most systemically important software.”

Anthropic reported that Mythos Preview discovered more than 10,000 high and critical severity vulnerabilities in less than two months, with Cloudflare alone discovering 2,000 bugs.
Independent validation confirmed 90% of the evaluated findings as real, although critics say the breakthrough may stem from massive computation and long-running agent workflows rather than qualitatively different reasoning.
The bottleneck has shifted from discovery to verification, disclosure and patching as AI now surfaces vulnerabilities faster than organizations can remediate them.

In less than two months, Anthropic and its partners have apparently discovered over ten thousand critical and high-severity security vulnerabilities using the popular artificial intelligence tool Mythos Preview.

In a brief update on the status of the project, released late last week, Anthropic said that since the tool’s release, about 50 organizations that have used it have each found “hundreds” of vulnerabilities.

“Several told us their bug detection rate increased more than tenfold,” the company said. “For example, Cloudflare found 2,000 bugs (400 of which were high or critical severity) on its critical path systems, with a false positive rate that the Cloudflare team considers better than that of human testers.”

Skepticism persists

Of the reported vulnerabilities, 1,752 were assessed by independent security researchers: 90% were confirmed as true positives, and 62.4% were confirmed as high or critical severity.

But while the overall reaction to Mythos Preview has been overwhelmingly positive, some voices say the hype may be overblown. A Techzine analysis, for example, argues that AI-assisted vulnerability discovery already existed thanks to systems like Google’s Big Sleep, and that the real challenge remains human operational security.

A recent academic article, “Comparative analysis of myth-related bug rediscovery”, found that under controlled conditions, publicly available frontier models like GPT-5.5 were able to rediscover some of the same vulnerabilities attributed to Mythos. On Reddit, various communities were even more skeptical. The key takeaway seems to be that Mythos may simply be using huge amounts of computation and long-running agent workflows rather than possessing qualitatively different reasoning capabilities.

Regardless, Anthropic now says that progress on software vulnerabilities is no longer limited by the speed of discovery, but by the speed of verification, disclosure, and patching.