Experts propose frameworks for trustworthy AI systems

Researchers outline emerging priorities for ensuring artificial intelligence systems are safe, reliable and aligned with human values as deployment expands across society.

AI safety, model quality, robustness, explainability, fairness, bias mitigation, governance, interdisciplinary research

A coalition of researchers and experts has identified future research directions aimed at enhancing AI safety, robustness and quality as systems are increasingly integrated into critical functions.

The work highlights the need for improved tools to evaluate, verify and monitor AI behaviour across diverse real-world contexts, including methods to detect harmful outputs, mitigate bias and ensure consistent performance under uncertainty.

The discussion emphasises that technical quality attributes such as reliability, explainability, fairness and alignment with human values should be core areas of focus, especially for high-stakes applications in healthcare, transport, finance and public services.

Researchers advocate interdisciplinary approaches that combine insights from computer science, ethics and the social sciences to address systemic risks and to design governance frameworks that balance innovation with public trust.

The article also notes emerging strategies such as formal verification techniques, robustness benchmarks and continuous post-deployment auditing, which could help contain unintended consequences and improve the safety of AI models both before and after they are deployed at scale.
