Google DeepMind updates AI safety framework for advanced risks
The latest Frontier Safety Framework addresses AI risks including harmful manipulation and misalignment, aiming to protect users and operators in high-stakes contexts.

Google DeepMind has released the third iteration of its Frontier Safety Framework (FSF), which aims to identify and mitigate severe risks from advanced AI models. The update expands the framework's risk domains and refines its process for assessing potential threats.
Key changes include a new Critical Capability Level (CCL) focused on harmful manipulation. It targets AI models capable of systematically influencing beliefs and behaviours in high-stakes contexts, so that safety measures keep pace with growing model capabilities.
The framework also strengthens protocols for misalignment risks, addressing scenarios in which an AI could override its operators' attempts to control or shut it down. Safety case reviews are now conducted before external launches, and before large-scale internal deployments, once models reach critical capability thresholds.
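As a rough sketch of how such a capability-gated launch process might look in code, the hypothetical Python example below checks illustrative evaluation scores against per-domain CCL thresholds and blocks deployment until a safety case review is approved. All names, scores, and thresholds are invented for illustration and do not reflect DeepMind's actual evaluations or tooling.

```python
# Hypothetical illustration of CCL-gated deployment (not DeepMind's process).
from dataclasses import dataclass


@dataclass
class CapabilityEvaluation:
    domain: str           # e.g. "harmful_manipulation", "misalignment"
    score: float          # illustrative evaluation score for the domain
    ccl_threshold: float  # score at which the CCL is considered reached


def crossed_ccls(evals: list[CapabilityEvaluation]) -> list[str]:
    """Return the risk domains whose score meets or exceeds the CCL threshold."""
    return [e.domain for e in evals if e.score >= e.ccl_threshold]


def may_deploy(evals: list[CapabilityEvaluation], safety_case_approved: bool) -> bool:
    """A launch (external or large-scale internal) proceeds only if no CCL is
    crossed, or a safety case review has been completed and approved."""
    return not crossed_ccls(evals) or safety_case_approved


if __name__ == "__main__":
    evals = [
        CapabilityEvaluation("harmful_manipulation", score=0.72, ccl_threshold=0.70),
        CapabilityEvaluation("misalignment", score=0.40, ccl_threshold=0.80),
    ]
    print("CCLs crossed:", crossed_ccls(evals))  # ['harmful_manipulation']
    print("May deploy without review:", may_deploy(evals, safety_case_approved=False))  # False
```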
The updated FSF sharpens risk assessments and applies safety and security mitigations in proportion to the severity of the threat. It reflects a commitment to evidence-based AI governance, collaboration with experts, and ensuring that AI benefits humanity while minimising risks.