UK evaluates frontier AI for operational cybersecurity applications

A new UK pilot demonstrated how AI can support cyber teams in finding critical weaknesses.

UK cyber teams used frontier AI to identify vulnerabilities across government repositories.

The UK Government Cyber Coordination Centre (GC3), in partnership with the National Cyber Security Centre (NCSC) and the AI Security Institute, has completed a pilot programme exploring how frontier AI models could strengthen cyber defence across government systems.

The initiative forms part of the UK’s Government Cyber Action Plan, which seeks to improve public-sector cyber resilience through the use of emerging technologies.

Teams participated in a series of hackathons that used advanced AI systems to analyse public government code repositories for potential security weaknesses.

Participants tested several approaches rather than relying on a single methodology. These included multi-agent workflows, AI-assisted vulnerability investigation and specialised AI skills designed to automate parts of the security auditing process. The teams compared architectures and workflows to determine which produced the most effective results.

The exercise identified 407 security findings, including vulnerabilities that could have enabled authentication bypass, data exposure and remote code execution. AI models demonstrated an ability to identify relationships between technical weaknesses across multiple services and uncover attack paths that conventional scanners often struggle to detect.
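To illustrate what linking weaknesses across services into an attack path can mean in practice, here is a minimal Python sketch. It is not the method used in the pilot, which the announcement does not describe. The service names, access levels and the `Finding` and `attack_paths` helpers are all hypothetical, chosen only to show how individual findings might be chained.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    service: str   # hypothetical service name
    weakness: str  # short description of the weakness
    requires: str  # access level an attacker needs before exploiting it
    grants: str    # access level the attacker holds afterwards


def attack_paths(findings, start, goal):
    """Breadth-first search over access levels; each finding acts as an edge."""
    paths = []
    queue = deque([(start, [])])
    while queue:
        level, path = queue.popleft()
        if level == goal:
            paths.append(path)
            continue
        for finding in findings:
            # Use each finding at most once per path so the search terminates.
            if finding.requires == level and finding not in path:
                queue.append((finding.grants, path + [finding]))
    return paths


# Illustrative data only; none of these findings come from the pilot.
FINDINGS = [
    Finding("login-api", "authentication bypass", "anonymous", "user"),
    Finding("reports", "data exposure", "user", "admin-token"),
    Finding("admin-console", "remote code execution", "admin-token", "code-execution"),
    Finding("archive", "outdated dependency", "internal-network", "user"),
]

paths = attack_paths(FINDINGS, "anonymous", "code-execution")
for path in paths:
    print(" -> ".join(f"{f.service} ({f.weakness})" for f in path))
```

In this toy data, no single finding gets an anonymous attacker to code execution, but three findings in different services chain together to do so. A scanner that reports each weakness in isolation would not surface that combination.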

Government departments validated the findings through existing security processes and remediated all critical vulnerabilities.

UK officials concluded that successful deployment depends less on the choice of AI model and more on how AI is integrated into structured security workflows. Human experts remained responsible for validating findings, prioritising risks and managing remediation efforts.

Following the results, GC3 plans to launch a second phase involving additional government departments, more AI systems and assessments of closed-source environments.

Why does it matter?

The pilot provides a practical example of how frontier AI systems can be used in operational cybersecurity rather than solely for research or experimentation. As governments and organisations face increasingly complex cyber threats, AI tools could help security teams identify vulnerabilities more quickly and uncover attack paths that traditional automated tools may miss.

The findings also reinforce the importance of human oversight in AI-enabled security operations. While AI can assist with vulnerability discovery and analysis at scale, expert validation and risk management remain essential. The project highlights a growing trend towards combining AI capabilities with human expertise to improve cyber resilience across critical systems and public-sector infrastructure.
