
Mythos AI Cybersecurity Concerns

Bola Ogbara

Ironically, Anthropic's AI cybersecurity tool, Mythos, has raised fears of the tool falling into the wrong hands, just as news of a breach emerges.

On April 7th, 2026, Anthropic announced that its Mythos model, a powerful artificial intelligence (AI) cybersecurity tool, would no longer be released to the public as originally planned, but would instead be shared with key organizations and partners as part of Project Glasswing. The change in plans is due to something Anthropic itself advertised when it first introduced Mythos: the sheer power of the program. On its website, Anthropic explains that “Claude Mythos Preview demonstrates a leap in these cyber skills—the vulnerabilities it has spotted have in some cases survived decades of human review and millions of automated security tests, and the exploits it develops are increasingly sophisticated.”


Mythos’ reported abilities go beyond identifying hard-to-locate vulnerabilities. Scientific American shared that of the “thousands of high-severity vulnerabilities, including some in every major operating system and web browser” that Anthropic said Mythos discovered, 99% had not yet been patched. The United Kingdom’s AI Security Institute (AISI) evaluated the program and found that it could complete expert-level hacking assignments almost three-quarters (73%) of the time. The model also scored 31 points higher than Anthropic’s previous model on the USAMO 2026 Math Olympiad, further evidence that the “leap” in capability has not been exaggerated.


Peter Swire, professor at the School of Cybersecurity and Privacy at the Georgia Institute of Technology, addressed the most obvious danger that Mythos posed after Anthropic’s announcement: “One risk after Mythos is that it will be easier to turn a vulnerability, a known flaw, into an exploit, something that somebody actually takes advantage of. Every cybersecurity defender should take Mythos seriously, but the expected harm to defense is likely to be far lower than the worst-case scenarios would suggest.”


This optimism is likely shared by the Project Glasswing partners, who will be the first to deploy the Mythos model “in an effort to secure the world’s most critical software.” Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks are the launch partners, whose considerable sway in the tech space makes their interest in stronger cybersecurity unsurprising. It is equally unsurprising that several government agencies have taken an interest in the model with the goal of finding vulnerabilities.


Despite Anthropic’s most recent feud with the Department of War over mass surveillance and AI-automated weapons, which resulted in the company being blacklisted by the Secretary of State, the Commerce Department and the National Security Agency (NSA) are reportedly using its latest tool. Anthropic’s blog notes that there are “ongoing discussions with US government officials about Claude Mythos Preview and its offensive and defensive cyber capabilities”, since “securing critical infrastructure is a top national security priority for democratic countries… the US and its allies must maintain a decisive lead in AI technology.”


This push for US AI leadership is in line with the current administration’s stance on AI, as Trump has called for “unchallenged global technological dominance.” Still, the drive for dominance does not seem to be government-wide. The Cybersecurity and Infrastructure Security Agency (CISA) is unlikely to join the Commerce Department and the NSA in receiving the Mythos Preview, despite being the nation’s top cyber agency. Given the continued cuts CISA has endured under Trump’s presidency, its exclusion from the support afforded to other agencies is unfortunate but expected.


Now, news of a potential Mythos breach has many concerned. This week, Bloomberg reported that a group of unauthorized users not included in Project Glasswing was able to access the tool, triggering an investigation by Anthropic. While it is not yet clear whether this is an access dispute or a true hack, the intrusion has revived fears of the AI falling into the wrong hands. In its own blog, Anthropic warned that rapid developments in AI, exemplified by the Mythos model itself, mean “it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely. The fallout - for economies, public safety, and national security - could be severe.”


Anthropic already has a history of its AI tools being put to malicious use. In February 2026, a hacker “jailbroke” the company’s AI chatbot, Claude, by convincing the bot to “act as an elite hacker, finding vulnerabilities in government networks, writing computer scripts to exploit them and determining ways to automate data theft”. This bypassed the safety guidelines built into the program and gave the attacker thousands of ready-to-execute plans, along with the credentials to use. With this information, the hacker stole 150 GB of data from several Mexican government agencies, including voter records, civil registry files, and taxpayer records. While the Claude hack was a clear demonstration of AI’s ability to help cybercriminals, the threat AI poses to cybersecurity has been researched for some time; the UK’s National Cyber Security Centre published a report on the danger as early as 2024.


For now, the US government seems more concerned with stopping AI knockoffs from undermining complete US leadership in AI. On April 23rd, 2026, Michael J. Kratsios, Assistant to the President for Science and Technology, released a memorandum on “Adversarial Distillation of American AI models”. The Center for Financial Inclusion describes AI distillation as “a process where a relatively small and resource-efficient model (often called the ‘student model’) learns to mimic a much larger and more complex and powerful LLM (often called the ‘teacher model’)”.
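In machine-learning terms, that student-teacher relationship is usually expressed as a loss function the student minimizes. The sketch below is a minimal, illustrative version of the idea, not code from any framework or from the memo: the function names, example logits, and temperature value are all assumptions made here for demonstration. The student is penalized by the KL divergence between the teacher's temperature-softened output distribution and its own, pushing it to mimic the teacher's full distribution rather than just its top answer.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature softens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    A distilled "student" is trained to minimize a loss like this one,
    learning to reproduce the teacher's full output distribution rather
    than only its single top prediction.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return sum(p * (math.log(p) - math.log(q))
               for p, q in zip(p_teacher, p_student))

# A student whose logits already match the teacher's incurs ~zero loss;
# a mismatched student incurs a positive loss and is nudged toward the teacher.
teacher = [4.0, 1.0, 0.5]
print(distillation_loss(teacher, teacher))           # ~0.0 (identical outputs)
print(distillation_loss(teacher, [0.5, 1.0, 4.0]))   # positive (they disagree)
```

Run against many prompts, a loop like this is what lets a small model absorb a large model's behavior cheaply, which is exactly why the memo treats unauthorized distillation of American models as a form of theft.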


In the memo, Kratsios explains how this legitimate practice can be abused by foreign actors “principally based in China” to make AI “products that appear to perform comparably on select benchmarks at a fraction of the cost.” One example is the R1 chatbot from Chinese AI company DeepSeek, which may have been a distillation of OpenAI’s o1 model. The bot was able to compete with better-known AI offerings at a lower price, and its release sent the stocks of Western tech companies tumbling. Adding to US concerns, the smaller distilled models may not carry the same safety measures and may also be less “ideologically neutral and truth-seeking.”


In response to these concerns, the Trump Administration says it will work with AI companies to alert them of any foreign distillation attempts, and weigh means “to hold foreign actors accountable” for their campaigns. If the unauthorized users behind the Mythos AI breach are foreign-based actors looking to distill the program, they may end up facing a high penalty for their actions.