CAISI expands frontier AI testing with Google DeepMind, Microsoft and xAI

New CAISI agreements will support pre-deployment AI testing, post-deployment assessments and targeted security research.


The Center for AI Standards and Innovation (CAISI), part of the US National Institute of Standards and Technology (NIST), has announced new agreements with Google DeepMind, Microsoft, and xAI to expand government evaluations of frontier AI models and support research on AI security.

According to the announcement, the agreements will support pre-deployment evaluations and targeted research intended to improve understanding of frontier AI capabilities and their national security implications.

CAISI says the updated arrangements build on earlier partnerships that were renegotiated to reflect directives from the Secretary of Commerce and the US AI Action Plan.

CAISI also says it has been designated as the US government's main point of contact for collaboration with industry on testing, joint research, and best-practice development for commercial AI systems. It reports having completed more than 40 evaluations to date, including assessments of advanced unreleased models.

CAISI Director Chris Fall said independent and rigorous measurement is essential to understanding frontier AI and its national security implications. The announcement adds that the agreements are intended to support information-sharing, voluntary product improvements, and a clearer government understanding of AI capabilities and international AI competition.

The agency notes that developers often provide models with reduced or removed safeguards to support national security-related testing. It also says evaluators from across government may participate through the CAISI-convened TRAINS Taskforce, and that the agreements are designed to support testing in classified environments and to adapt to continued advances in AI.
