Guest Commentary

AI-Enabled Vulnerability Discovery Is Reshaping National Cyber Defence


Mythos: Achilles and Memnon fighting, Trojan War, Attic, red-figure krater. Image: history_docu_photo / Alamy Stock


The risk that AI-enabled vulnerability discovery tools will be misused is real. How can the UK minimise its strategic dependence and stay capable in cyber defence?

Anthropic’s new large language model (LLM), Claude Mythos, drew intense attention after demonstrating strong vulnerability-discovery capabilities. In the widely used browser Firefox alone, for example, the preview version of Claude Mythos reportedly helped identify 271 vulnerabilities. The severity of these vulnerabilities was not disclosed, but earlier Claude Opus testing on Firefox found 22 security flaws, 14 of which were high-severity.

To put this into perspective, exploits targeting zero-day browser vulnerabilities – that is, code designed to abuse a previously unknown vulnerability in the browser to gain access to the device running it – can cost between a few hundred thousand and several million dollars on the exploit market. Furthermore, according to prior research by RAND, zero-day exploits have historically remained useful for an average of 6.9 years after private discovery. Finding, validating and weaponising such vulnerabilities has typically required specialist human expertise and significant time, often months or years.

Now, new LLM-based systems are beginning to automate several parts of that process, which increases speed and reduces costs at the same time. They can help find vulnerabilities at scale, reason about exploitability and write exploit code. Beyond this, new agentic AI capabilities make it increasingly possible for LLM-based systems to move from identifying vulnerabilities to attempting exploitation with limited human guidance. In UK AI Security Institute tests, Mythos was able to carry out multi-stage attacks on vulnerable networks. Similarly, a recent Stanford study found that an agentic system could outperform most human professionals in certain types of penetration testing in live enterprise environments. As the Financial Times put it, the concern is that such tools could ‘turbocharge hacking’.


Claude Mythos should not be treated as an isolated breakthrough. OpenAI has already introduced GPT-5.4-Cyber, while China’s 360 Digital Security Group has reported significant progress with its own internally developed vulnerability discovery system. Together, these developments suggest the emergence not of a single model that can be contained, but of a capability class that is likely to diffuse. The response has been intense. Reuters and the Financial Times have reported on concerns about the implications for banks, financial regulators and the global banking system, while a Bloomberg UK opinion piece noted that the threat extends well beyond banking. One Financial Times piece likened the development to the building of the first atomic bomb. Speaking to the BBC, Canadian Finance Minister François-Philippe Champagne compared the uncertainty around Anthropic to the geopolitical crisis around the Strait of Hormuz, saying, ‘The difference is that the Strait of Hormuz – we know where it is and we know how large it is . . . the issue that we're facing with Anthropic is that it's the unknown-unknown.’ Similarly, in Foreign Affairs, Brianna Rosen and Jam Kraprayoon argued that autonomous AI cyber-agents could accelerate cyber operations to a speed and scale that may destabilise global security.

The Danger During the Transition

It is tempting to treat AI-enabled vulnerability discovery as a complete break from earlier cybersecurity practice, but the automation of cybersecurity tasks is not new, and its most disruptive effects are often felt during the period before defenders adapt. Concerns about tools automating hacking and vulnerability analysis became commonplace in the 2000s, when user-friendly tools such as Metasploit, sqlmap and Burp Suite were widely adopted. Yet they did not make hacking effortless for everyone – they simply became part of the standard toolkit. Attackers used them, but serious defenders also incorporated them into their own security testing practices. This created a new baseline in which human skill still mattered, but increasingly on top of automated tooling.

The same logic is likely to apply here. For a period, this new class of AI tools will uncover large numbers of vulnerabilities, potentially exposing some organisations before they can respond. However, as such automation proliferates, defenders will also use these tools to find and address weaknesses. The competition between attackers and defenders will then move to whatever automation has not yet absorbed. In particular, if both attackers and defenders can use comparable tools to find vulnerabilities, the advantage will shift to whatever remaining areas still allow human operators to add value beyond AI, such as niche skills, cyber-physical operations, or insider threats. Human expertise will remain important not because it can substitute for AI-enabled capability, but because it becomes the differentiator above the new baseline.


This also means that AI-enabled vulnerability discovery expands the existing requirements for national cyber defence. The UK still needs a well-educated cyber workforce, secure software development practices and institutions that embed security into everyday processes, but it will now also need to adopt and use these new AI tools effectively. In the short term, for many organisations – especially those that are not handling highly classified or extremely sensitive code – adoption will largely mean using tools developed by frontier AI firms. In the long term, however, especially for government, defence and critical national infrastructure, the question becomes how the UK avoids dependence while still using the best tools available.

Retaining Access, Reducing Dependence

On the one hand, because human cybersecurity experts cannot simply substitute for this new technology, access to it will become necessary to meet the new defensive baseline. On the other hand, because the most capable tools are likely to be provided through a relatively small number of external proprietary platforms, relying on them too heavily could create a new strategic dependency.

Such a dependency would come with several risks. Access to these tools could be affected by vendor policy, pricing, safety rules and export controls. More importantly, uses in defence, intelligence, critical national infrastructure, government and critical suppliers would require the tool to regularly process highly sensitive information, including source code and system schematics. In such cases, the provider’s own security, data handling and data sharing practices, insider threat controls, backup arrangements, ownership structure and any future merger or acquisition would all become matters of national security for the UK.

While the UK should avoid unnecessary dependencies, it is not realistic to build everything domestically. At this time, the UK does not have the spare capacity to compete directly with the largest players across the whole AI stack, and current government plans suggest that ‘Sovereign AI compute will almost certainly be the smallest component of the UK’s overall compute portfolio’. The goal, therefore, should be to use finite sovereign capacity where dependence would be most problematic, while retaining access to the most capable tools available.


The UK, therefore, needs an ambidextrous strategy that uses the most capable proprietary tools to scale cyber defence rapidly, while building sovereign and open-source capability to process sensitive information and reduce dependency. External proprietary platforms are likely to remain ahead in some tasks, and UK defenders should benefit from that performance where the sensitivity of the work allows it. At the same time, sovereign capability should be developed – especially by building on open-source frameworks – for tasks where external dependency creates strategic risk. One way to make this strategy more efficient would be to create a feedback loop between the two processes. More specifically, where possible, UK organisations using proprietary tools could be encouraged to also use sovereign tools and to share sanitised data on their relative performance with trusted public bodies, such as the National Cyber Security Centre, the AI Security Institute and the Department for Science, Innovation and Technology. This would help identify precisely where sovereign tools are falling behind (in other words, which classes of vulnerability are being missed and under which circumstances) and, thus, where public investment would have the greatest impact.

A further option would be to build a UK-controlled, model-agnostic cyber platform that can use different LLMs underneath it. For less sensitive tasks, it could route work primarily to the highest-performing proprietary systems. For highly sensitive code or classified data, it could use locally deployed open-source or sovereign models. Such a platform would have institutional advantages, including auditability, logging and control over data exposure. Because the routing and control layer would be developed domestically, it would also build national expertise in the practical use of AI-enabled cyber defence. In addition, it could support selective data-sharing strategies, so that no single external provider sees the whole picture. It would also make it easier to switch providers, or fall back on domestic systems, if access to an external provider becomes unreliable.
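To make the routing idea concrete, the core of such a platform can be sketched in a few lines of code. This is an illustrative sketch only, not a description of any existing system: all class names, sensitivity tiers and backend labels below are hypothetical, and a real platform would add performance-based selection, data sanitisation and far richer audit controls.

```python
# Minimal sketch of a sensitivity-based routing layer for a
# model-agnostic cyber platform. All names here are hypothetical.
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable

class Sensitivity(Enum):
    PUBLIC = 1       # e.g. open-source code, public advisories
    INTERNAL = 2     # ordinary business systems
    CLASSIFIED = 3   # government, defence or CNI material

@dataclass
class Backend:
    name: str
    sovereign: bool              # locally deployed / domestically controlled?
    run: Callable[[str], str]    # submits a task, returns a finding

@dataclass
class Router:
    backends: list                               # ordered by preference
    audit_log: list = field(default_factory=list)

    def route(self, task: str, level: Sensitivity) -> str:
        # Classified work may only touch sovereign, locally deployed
        # models; less sensitive work may use any backend, so the
        # highest-performing external system can be preferred for it.
        eligible = [b for b in self.backends
                    if b.sovereign or level is not Sensitivity.CLASSIFIED]
        chosen = eligible[0]  # stand-in for a performance-based choice
        # Logging every routing decision gives the auditability the
        # platform is meant to provide.
        self.audit_log.append((chosen.name, level))
        return chosen.run(task)
```

The key design point is that the routing policy, not the individual model, is where sovereignty is enforced: swapping providers, or falling back to domestic systems when external access becomes unreliable, is then a change to the backend list rather than to the platform itself.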

Overall, most short-term risks of AI-enabled vulnerability discovery for the UK can be reduced through rapid defensive adoption. In the longer term, the main strategic task is to use finite sovereign capacity selectively so that the UK can benefit from these technologies without creating unnecessary strategic dependencies.

© Aybars Tuncdogan, 2026, published by RUSI with permission of the author.

The views expressed in this Commentary are the author's, and do not represent those of RUSI or any other institution.




WRITTEN BY

Dr Aybars Tuncdogan

Guest Contributor
