Developing a Framework for Secure Third-Party Access to Frontier AI
This report proposes a shared framework for understanding and managing the security risks of third-party access to frontier AI models.
A Note on Publication
The report will be available on the website in early May 2026. We are happy to provide an advance copy privately from 27 April upon request. Please contact Dr Louise Marie Hurel (LouiseH@rusi.org).
Executive Summary
As frontier AI models expand in capability and application, it is vital that their evolution remains grounded in safety and security safeguards. Proactive safeguarding is essential to prevent them from being misused or repurposed by malign state or criminal actors to conduct cyber-attacks, terrorist attacks, and other harmful activities.
Third-party evaluation of frontier AI models is increasingly recognised as essential to safety and security by developers, governments, and regulators alike. Yet enabling meaningful external evaluation requires granting access to some of the most sensitive intellectual property in the AI sector. The security risks associated with this access, from intellectual property leakage to model compromise to exploitation by state-sponsored actors, remain poorly mapped and inadequately standardised. This gap stifles the evaluation ecosystem: developers restrict access out of security concerns, while evaluators lack the information they need to conduct effective assessments.
This report proposes a shared framework for understanding and managing these risks. Drawing on the work of the Secure Access to Frontier AI Taskforce, which brought together evaluators, AI labs, cybersecurity experts, and AI governance researchers, it presents an Access-Risk Matrix that maps known cyber and information security threats to different levels of model access and matches them with appropriate countermeasures and stakeholder responsibilities. The aim is to move the conversation beyond the current tension between openness and security, or between security and innovation, towards a practical, shared understanding of how to enable secure and effective third-party evaluation at scale.
The report makes three core contributions:
First, it develops a threat taxonomy of seven security risk categories specific to the context of third-party evaluation: Model Theft, Capability Reconstruction, Model Manipulation, Jailbreak/Safety Bypass, Accidental Exposure, Credential Compromise, and Access Persistence, with Weaponisation as a potential resultant outcome. Each category is defined, illustrated with real-world examples and research evidence, and situated within a risk hierarchy organised by access depth and actor sophistication.
Second, it proposes an Access-Risk Matrix that maps six types of evaluator access, ranging from query-level inference to model internals, training data, evaluation data, and compute infrastructure, against each risk category. The matrix assigns indicative severity ratings to each access-related risk, providing a structured basis for identifying the most suitable security mitigations for a given threat model.
Third, and building on the Access-Risk Matrix, the report proposes security mitigations organised by the risk categories identified above. Mitigations are categorised as technical, procedural, or contractual measures, and responsibility for each is assigned to developers, evaluators, or both jointly. These controls are grounded in established security principles (e.g., least privilege, assume breach, need-to-know, data minimisation, time-bound access, and proportionality) adapted for the specific context of frontier AI model evaluation (a minimal illustrative sketch of how the matrix and its mitigation mappings fit together follows this list).
Finally, the report proposes a shared governance framework for maturing the third-party evaluation ecosystem and presents it as four pillars of action for the multistakeholder community:
(1) ensuring that security concerns do not impede safety-critical evaluation;
(2) harmonising language and access tiers across the ecosystem;
(3) operationalising secure access through shared standards and practices; and
(4) building feedback loops that allow the framework to evolve as the threat landscape, regulatory requirements, and evaluation methodologies develop.
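For readers who find a concrete representation helpful, the sketch below shows one possible way the Access-Risk Matrix could be encoded, pairing access types and risk categories with indicative severity ratings, mitigations, and stakeholder responsibility. The access tiers, rating scale, and example entries are illustrative assumptions only; the report itself defines the authoritative six access types, severity ratings, and mitigations.

```python
# Illustrative sketch only. The rating scale and example entries are
# assumptions for demonstration and are not taken from the report.
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):  # hypothetical rating scale
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# The seven risk categories named in the report's threat taxonomy
RISK_CATEGORIES = [
    "Model Theft",
    "Capability Reconstruction",
    "Model Manipulation",
    "Jailbreak/Safety Bypass",
    "Accidental Exposure",
    "Credential Compromise",
    "Access Persistence",
]


@dataclass
class MatrixEntry:
    access_type: str        # e.g. "query-level inference", "model internals"
    risk_category: str      # one of RISK_CATEGORIES
    severity: Severity      # indicative rating for this access/risk pair
    mitigations: list[str]  # technical, procedural, or contractual controls
    responsible: str        # "developer", "evaluator", or "joint"


# Hypothetical example entries, for illustration only
ACCESS_RISK_MATRIX = [
    MatrixEntry("query-level inference", "Capability Reconstruction",
                Severity.MEDIUM, ["rate limiting", "query logging"], "developer"),
    MatrixEntry("model internals", "Model Theft",
                Severity.CRITICAL, ["time-bound credentials", "access auditing"], "joint"),
]


def entries_for_access(access_type: str) -> list[MatrixEntry]:
    """Return all risk entries recorded for a given type of evaluator access."""
    return [e for e in ACCESS_RISK_MATRIX if e.access_type == access_type]
```

Structuring the matrix this way, as a set of access/risk pairs rather than a flat grid, would let severity ratings, mitigations, and responsibilities evolve independently as the threat landscape changes, in line with the feedback loops described in pillar (4).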
About the Secure Access to Frontier AI Taskforce
As the rate of innovation and deployment of frontier AI models accelerates, ensuring secure and trusted third-party evaluations is a greater priority than ever. Third-party evaluators play an invaluable role in providing accountability, stress-testing increasingly complex systems and applications, and building vital trust in them. Ensuring that development labs and evaluators are sensitised and equipped to actively mitigate potential risks is vital to delivering evaluations at scale and to safeguarding public and industry trust and safety. As part of this project, RUSI has established the Secure Access to Frontier AI (SAFA) Taskforce, drawing together technical experts from labs and evaluators, alongside cyber and information security experts from across sectors, to evaluate the security of third-party access.
Acknowledgements
The authors would like to thank all those who volunteered their time and expertise as part of the Taskforce. In particular, we would like to acknowledge the contributions of all SAFA-TF members and participants throughout the workshops, as well as their feedback throughout the development of this report: Daniel Cuthbert, Mohamed Samy, Ehab Hussein, Omer Nevo, Alejandro Ortega, Kevin Klyman, Markus Anderljung, George Balston, Francesca Federico, Charles Foster, Esme Harrington, Pia Huesch, Kellin Pelrine, Adriana Stephan, Raquel Vazquez, Pegah Maham, Talita Dias, Rumman Chowdhury, Robert Trager, Conrad Stosz, Dawn Song, Abby Cruz, Mathias Vermeulen, Clement Briens, Markus Hobbhahn, Madeline Carr, Ingrid Dickinson and others who provided comments on earlier versions of this report.
WRITTEN BY
Dr Louise Marie Hurel
Senior Research Fellow
Cyber and Tech
Elijah Glantz
Research Fellow
Organised Crime and Policing
Daniel Cuthbert
RUSI Associate Fellow, Cyber and Tech
Jim McLean
Media Relations Manager
+44 (0)7917 373 069
JimMc@rusi.org