OpenAI launches AI safety hub

Instead of keeping safety data private, OpenAI now offers an early user feedback phase and publishes updated risk assessments after major model changes.

OpenAI has launched a public online hub to share internal safety evaluations of its AI models, aiming to increase transparency around how they handle harmful content, jailbreak attempts, and hallucinations. The hub will be updated after major model changes, allowing the public to track progress in safety and reliability over time.

The move follows growing criticism of the company’s testing methods, especially after inappropriate ChatGPT responses surfaced in late 2023. Instead of waiting for backlash, OpenAI is now introducing an optional alpha testing phase that lets users provide feedback before wider model releases.

The hub also marks a departure from the company’s earlier stance on secrecy. In 2019, OpenAI initially withheld the full GPT-2 model over misuse concerns. Since then, it has shifted towards greater transparency, forming safety-focused teams and responding to calls for open safety metrics.

OpenAI’s approach appears timely, as several countries are building AI Safety Institutes to evaluate models before launch. Rather than leaving oversight to private sector efforts alone, the global landscape now reflects a multi-stakeholder push to create stronger safety standards and governance for advanced AI.
