Developing a Framework for Secure Third-Party Access to Frontier AI

[Image: walls of data servers. Generated with Canva AI]


This report proposes a shared framework for understanding and managing the security risks of third-party access to frontier AI models.

A Note on Publication

The report will be available on the website in early May 2026. We are happy to provide an advance copy in private from 27 April upon request. Please contact Dr Louise Marie Hurel (LouiseH@rusi.org).

Executive Summary

As frontier AI models expand in capability and application, it is essential that their development remains grounded in safety and security safeguards. Proactive safeguarding is needed to prevent these models from being misused or repurposed by malign state or criminal actors to conduct cyber-attacks, terrorist attacks and other harmful activities.

Third-party evaluation of frontier AI models is increasingly recognised as essential to safety and security by developers, governments and regulators alike. Yet enabling meaningful external evaluation requires granting access to some of the most sensitive intellectual property in the AI sector. The security risks associated with this access, from intellectual property leakage to model compromise to exploitation by state-sponsored actors, remain poorly mapped and inadequately standardised. This gap stifles the evaluation ecosystem: developers restrict access out of security concerns, while evaluators lack the information they need to conduct effective assessments.

This report proposes a shared framework for understanding and managing these risks. Drawing on the work of the Secure Access to Frontier AI (SAFA) Taskforce, which brought together evaluators, AI labs, cybersecurity experts and AI governance researchers, it presents an Access-Risk Matrix that maps known cyber and information security threats to different levels of model access and matches them with appropriate countermeasures and stakeholder responsibilities. The aim is to move the conversation beyond the perceived trade-offs between openness and security, and between security and innovation, towards a practical, shared understanding of how to enable secure and effective third-party evaluation at scale.

The report makes three core contributions:

First, it develops a threat taxonomy of seven security risk categories specific to the context of third-party evaluation: Model Theft, Capability Reconstruction, Model Manipulation, Jailbreak/Safety Bypass, Accidental Exposure, Credential Compromise and Access Persistence, with Weaponisation as a potential resultant outcome. Each category is defined, illustrated with real-world examples and research evidence, and situated within a risk hierarchy organised by access depth and actor sophistication.

Second, it proposes an Access-Risk Matrix that maps six types of evaluator access, ranging from query-level inference to model internals, training data, evaluation data and compute infrastructure, against each risk category. The matrix assigns indicative severity ratings to each access-related risk, providing a structured basis for selecting the security mitigations best suited to a given threat model.

Third, building on the Access-Risk Matrix, the report proposes security mitigations organised by the risk categories identified above. It distinguishes between technical, procedural and contractual measures and assigns responsibility for each to developers, evaluators or both jointly. These controls are grounded in established security principles (e.g., least privilege, assume breach, need-to-know, data minimisation, time-bound access and proportionality), adapted to the specific context of frontier AI model evaluation.

Finally, the report proposes a shared governance framework for maturing the third-party evaluation ecosystem and presents it as four pillars of action for the multistakeholder community: 
(1) ensuring that security concerns do not impede safety-critical evaluation; 
(2) harmonising language and access tiers across the ecosystem; 
(3) operationalising secure access through shared standards and practices; and 
(4) building feedback loops that allow the framework to evolve as the threat landscape, regulatory requirements, and evaluation methodologies develop.

About the Secure Access to Frontier AI Taskforce

As the rate of innovation and deployment of frontier AI models accelerates, ensuring secure and trusted third-party evaluation is a greater priority than ever. Third-party evaluators play an invaluable role in providing accountability, stress-testing models and building vital trust in increasingly complex systems and applications. Ensuring that development labs and evaluators are sensitised and equipped to actively mitigate potential risks is vital to delivering evaluations at scale, and to safeguarding public and industry trust and safety. As part of this project, RUSI has established the Secure Access to Frontier AI (SAFA) Taskforce, drawing together technical experts from labs and evaluators, as well as cyber and information security experts from across sectors, to assess the risks of third-party access.

The Taskforce explores the cyber security, information security and model-level risks associated with third-party access to frontier AI models.

Acknowledgements

The authors would like to thank all those who volunteered their time and expertise as part of the Taskforce. In particular, we would like to acknowledge the contributions of all SAFA-TF members and participants throughout the workshops, as well as their feedback throughout the development of this report: Daniel Cuthbert, Mohamed Samy, Ehab Hussein, Omer Nevo, Alejandro Ortega, Kevin Klyman, Markus Anderljung, George Balston, Francesca Federico, Charles Foster, Esme Harrington, Pia Huesch, Kellin Pelrine, Adriana Stephan, Raquel Vazquez, Pegah Maham, Talita Dias, Rumman Chowdhury, Robert Trager, Conrad Stosz, Dawn Song, Abby Cruz, Mathias Vermeulen, Clement Briens, Markus Hobbhahn, Madeline Carr, Ingrid Dickinson and others who provided comments on earlier versions of this report.


WRITTEN BY

Dr Louise Marie Hurel

Senior Research Fellow

Cyber and Tech

Elijah Glantz

Research Fellow

Organised Crime and Policing

Daniel Cuthbert

RUSI Associate Fellow, Cyber and Tech

