New AI safety policies target teen protection in apps

The latest OpenAI prompt-based safety policies aim to help developers build AI systems that better protect teenagers from harmful or inappropriate content.

OpenAI has released a set of prompt-based safety policies to help developers build safer AI experiences for teenagers.

OpenAI has released a set of prompt-based safety policies to help developers build safer AI experiences for teenagers. The tools work with the open-weight model gpt-oss-safeguard, turning safety requirements into practical classifiers for real-world use.

The policies address teen risks, including graphic violence, sexual content, harmful body image behaviour, dangerous challenges, roleplay, and age-restricted goods and services. Developers can use them for both real-time filtering and offline content analysis.

The framework was developed with input from organisations such as Common Sense Media and everyone.ai to improve clarity and consistency in teen safety rules. The initiative also responds to long-standing challenges in translating high-level safety goals into precise operational systems.

Open-source availability through the ROOST Model Community allows developers to adapt and expand the policies for different use cases and languages. The framework is a foundational step, not a complete solution, encouraging layered safeguards and ongoing refinement.

Would you like to learn more about AI, tech and digital diplomacy? If so, ask our Diplo chatbot