Mass Criticality: Rethinking Critical Infrastructure in the UK

Common threats: The ubiquity of individual software providers makes their vulnerabilities widespread. Image: James Thew / Adobe Stock


As digital systems become more interdependent and concentrated, a growing number of ‘ordinary’ components quietly turn into de facto critical infrastructure.

MI6’s new chief, Blaise Metreweli, recently warned that the ‘front line is everywhere’. In a separate speech, the Chief of the Defence Staff, Sir Richard Knighton, argued that ‘our whole nation’ must step up. Taken together, these remarks signal that the UK is now operating in a hybrid threat landscape – ‘a space between peace and war’, as Metreweli put it – where remote cyber operations and local cyber-physical attacks, including some that leverage insider access, sit alongside more traditional military risks.

In that hybrid threat landscape, adversaries are likely to reach first for tools that disrupt daily life and economic activity rather than for conventional warfare, which tends to sit higher on the escalation ladder. Yet our current definitions of critical infrastructure, still focused mainly on utilities such as electricity grids and power plants, do not adequately account for the many other bottlenecks that can now be exploited to disrupt society – some of which are so fragile that significant disruptions can occur even without malicious activity.

Strikes to Ubiquitous Software

On 18 November 2025, people around the world found that much of the internet had suddenly stopped working. Social networks like X (formerly Twitter) went dark, AI services like ChatGPT were inaccessible, and streaming platforms like Spotify didn’t work. The websites of government bodies, hospitals, banks and media organisations were among the countless systems that stalled. This was neither the result of an advanced computer worm like Stuxnet nor an international cyberwar. An automatically generated configuration file in the systems of a single organisation – Cloudflare – triggered a crash. A simple mistake by one of many automated processes in a complex ecosystem was enough to take down a sizeable part of the internet that depended on that single company.

One might be tempted to think of this as a freak event – a rare alignment of unlikely factors. Unfortunately, it was not. Less than three weeks later, on 5 December 2025, another Cloudflare outage hit. This time, around 28% of Cloudflare’s traffic was affected but, given the company’s scale, the impact was still substantial, taking out systems across several sectors, including Shopify, Zoom and LinkedIn. A critical flaw in a widely used toolkit for building web user interfaces, React, forced Cloudflare to roll out an emergency mitigation, which then triggered a latent bug in its own software. As Cloudflare’s chief technical officer, Dane Knecht, put it, ‘we know we have let the Internet down again’.

Cloudflare is hardly alone. The day before that second outage, on 4 December 2025, reports emerged that a ransomware breach at a software supplier called Marquis had affected more than 70 US banks and credit unions. About a month earlier, an Amazon Web Services outage took out a significant part of the internet. In 2024, a single mistaken update by cybersecurity company CrowdStrike crashed about 8.5 million computers; Delta Air Lines alone allegedly lost close to $500 million. The digital infrastructure that powers today’s economies has become so tightly interdependent that a single malicious attack – or a single unintentional mistake – at one company can now wreak havoc across sectors.

In other cases, the problem is less about how many systems depend on a component and more about how much is concentrated in one place, as in the recent Coupang breach in South Korea. After a former senior engineer’s access was reportedly not fully revoked, personal data from 33.7 million customer accounts, representing approximately two-thirds of the country’s population, was exposed. Because so much information was concentrated in a single organisation – Coupang is commonly described as the ‘Amazon of Korea’ – compromising one corporate system was enough to turn a routine access-management failure into a national incident.

This is what I call the ‘mass criticality’ problem: even seemingly ordinary products and services – such as a toolkit like React or a behind-the-scenes software supplier like Marquis – are turning into de facto critical infrastructure. Mass criticality arises when digital systems are both highly interdependent and highly concentrated in the hands of a few providers. When one of these components is hacked or suffers an outage, the effects can propagate widely across sectors and borders.

The problem is that we are still treating each of these incidents as if it were an isolated case. When Cloudflare’s chief technical officer says ‘we know we have let the Internet down again’, he is blaming himself and his team for the incident. However, the key issue is not that Cloudflare had bugs or that React had a critical vulnerability – those will keep appearing, no matter how many source code audits, red-team exercises and bug-hunting campaigns are run. The deeper problem is that a fault at Cloudflare can take down a significant part of the internet, that a flaw in React can in turn destabilise Cloudflare, and that failures in other widely shared components can destabilise React itself. We have, in the UK and across the world, ended up with a digital ecosystem full of single points of critical failure.

This highlights a mismatch with the way critical national infrastructure is largely framed in the UK, which remains geared towards a collection of utilities and physical assets rather than a network of shared digital dependencies and a small number of highly concentrated digital services. In particular, while critical infrastructure inventories are crucial, they do not adequately capture the systemic risk arising from failures at highly concentrated providers on which many other sectors – including hospitals, financial institutions and government services – depend.

Critical Third Parties

Dealing with this challenge requires policymakers to move on several fronts, including setting expectations for key digital providers and the organisations that rely on them. First, policymakers need to map how essential services depend on digital intermediaries such as cloud platforms, identity providers and security tools. In cases where essential services are heavily reliant on intermediaries, or the market is highly concentrated around one or a few providers, some of these providers could be assigned to an appropriate tier in a layered framework for critical infrastructure. One existing step in this direction is the UK financial sector’s statutory framework for ‘critical third parties’, which recognises that failures at certain technology providers can pose systemic risk. However, extending this logic to the wider economy would require rethinking what counts as ‘critical national infrastructure’, since many of these digital services are hosted and operated across borders.

Second, policymakers should consider developing standard approaches for the reporting of digital dependencies and taking steps to limit concentration – the lack of technological diversity – at multiple levels of an organisation’s technology stack. For example, in the 2024 CrowdStrike incident, millions of Windows machines crashed while Linux systems continued to operate. Earlier that year, a separate issue with the same software caused outages on servers running certain Linux distributions, while other systems – including Windows machines – were not affected. It will not always be possible to avoid reliance on a single service at a given level – for instance, an organisation may need to use the same endpoint security product across most of its network – but it is often possible to mitigate at least some of the cascading effects by diversifying at other levels. A standardised way of reporting which components are critical at each layer of the technology stack would allow policymakers to encourage organisations to diversify their architectures – for example, by running backup systems on a different cloud platform or operating system.
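
To make the idea of layer-by-layer dependency reporting concrete, the following is a minimal sketch rather than a proposed standard. It assumes each organisation files a simple report naming its provider at each layer of its technology stack; the organisations, providers, layer names and the 70% threshold are all hypothetical and chosen purely for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical dependency reports: organisation -> {stack layer: provider}.
# All names are illustrative assumptions, not real data.
REPORTS = {
    "hospital_a": {"cloud": "cloud_x", "endpoint_security": "edr_1", "os": "windows"},
    "bank_b": {"cloud": "cloud_x", "endpoint_security": "edr_1", "os": "linux"},
    "council_c": {"cloud": "cloud_y", "endpoint_security": "edr_1", "os": "windows"},
    "utility_d": {"cloud": "cloud_x", "endpoint_security": "edr_2", "os": "linux"},
}


def concentration_by_layer(reports):
    """Return {layer: {provider: share of organisations reporting that layer}}."""
    counts = defaultdict(Counter)
    for stack in reports.values():
        for layer, provider in stack.items():
            counts[layer][provider] += 1
    return {
        layer: {p: n / sum(c.values()) for p, n in c.items()}
        for layer, c in counts.items()
    }


def concentrated_layers(reports, threshold=0.7):
    """List (layer, provider, share) wherever one provider's share meets the threshold."""
    flagged = []
    for layer, shares in concentration_by_layer(reports).items():
        for provider, share in shares.items():
            if share >= threshold:
                flagged.append((layer, provider, share))
    return sorted(flagged)


if __name__ == "__main__":
    for layer, provider, share in concentrated_layers(REPORTS):
        print(f"{layer}: {provider} serves {share:.0%} of reporting organisations")
```

In this toy data set, the cloud and endpoint security layers are each dominated by one provider, while the operating system layer is evenly split. That is the kind of signal that could prompt diversification at another layer when a single product is unavoidable at one of them.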

Finally, policymakers should consider embedding mass criticality into how they test the resilience of national systems. This means that, beyond routine risk checks, they should run regular tabletop exercises and cross-sector simulations that assume the temporary loss, degradation or breach of a highly concentrated or systemically important digital provider, such as a key cloud platform or identity provider. For example, they can examine the extent to which the UK’s essential services remain functional if a key identity provider or cloud service is hacked, suffers a major outage or is otherwise no longer reachable. Likewise, it is important to understand whether there is a short chain of critical components that, if hacked or rendered non-functional – for instance by an adversary leveraging pre-positioned access built up over time – could cripple significant parts of the system.
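
A simulation of this kind can be sketched in a few lines, under strong simplifying assumptions. The dependency map below is entirely hypothetical, and the model treats every dependency as a hard one: a component is assumed to fail as soon as anything it relies on fails. Real exercises would need to account for fallbacks, partial degradation and recovery time.

```python
# Hypothetical dependency map: each component lists the components it needs.
# Names are illustrative assumptions, not real services.
DEPENDS_ON = {
    "hospital_booking": ["identity_p", "cloud_x"],
    "payments": ["cloud_x", "cdn_z"],
    "tax_portal": ["identity_p", "cloud_y"],
    "identity_p": ["cloud_x"],
    "cdn_z": [],
    "cloud_x": [],
    "cloud_y": [],
}
ESSENTIAL = {"hospital_booking", "payments", "tax_portal"}


def failed_components(depends_on, lost):
    """Return every component that fails, directly or transitively, once `lost` is down."""
    failed = set(lost)
    changed = True
    while changed:  # repeat until no further component is knocked out
        changed = False
        for component, deps in depends_on.items():
            if component not in failed and any(d in failed for d in deps):
                failed.add(component)
                changed = True
    return failed


def impact_ranking(depends_on, essential):
    """Rank non-essential components by how many essential services their loss takes down."""
    ranking = []
    for component in depends_on:
        if component in essential:
            continue
        down = failed_components(depends_on, {component}) & essential
        ranking.append((component, sorted(down)))
    return sorted(ranking, key=lambda item: (-len(item[1]), item[0]))


if __name__ == "__main__":
    for component, down in impact_ranking(DEPENDS_ON, ESSENTIAL):
        print(f"loss of {component} disables: {', '.join(down)}")
```

In this toy map, losing the hypothetical `cloud_x` disables all three essential services, because the identity provider itself runs on it. Hidden chains of this sort are what a cross-sector exercise would aim to surface.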

Metreweli’s warning that the ‘front line is everywhere’ also describes the UK’s position in the digital domain. In a world of growing digital dependency and concentration, systemic risk now runs through cloud platforms, identity providers and security tools as much as through power plants, water treatment facilities and electricity grids. This makes it essential to update how we define critical infrastructure, how we map dependencies and how we test for failure, so that mass criticality in digital systems becomes a managed risk rather than a hidden systemic vulnerability. Taking these steps would not prevent every outage or breach but would help to limit the damage when such incidents occur and reduce the scope for adversaries to turn shared digital components into tools of coercion or conflict.

© Dr Aybars Tuncdogan, 2026, published by RUSI with permission of the author.

The views expressed in this Commentary are the author's, and do not represent those of RUSI or any other institution.



WRITTEN BY

Dr Aybars Tuncdogan

Guest Contributor
