Can we test for trust? The verification challenge in AI
10 Jul 2025 10:10h - 10:40h
Session at a glance
Summary
This panel discussion, titled “Can we test for trust? The verification challenge in AI,” brought together experts from various sectors to address the complex challenges of testing and verifying AI systems for trustworthiness. The conversation emerged from intensive multi-stakeholder discussions involving representatives from all continents, focusing on identifying gaps in current AI testing ecosystems and exploring collaborative solutions.
A central theme was the need for more inclusive and globally representative approaches to AI testing and standards development. Rachel Adams emphasized that historically, standard-setting processes have excluded developing countries, and current pre-deployment testing is concentrated in resource-rich regions. She argued for multistakeholder, multidisciplinary approaches that incorporate diverse worldviews and local realities, noting that AI systems fail differently across various global contexts.
Boulbaba Ben Amor stressed the critical importance of benchmarking, calling for increased focus on testing AI products and solutions rather than just models. He highlighted the need for industry-specific benchmarks across healthcare, education, defense, and other sectors, particularly as AI evolves toward more autonomous “agentic” solutions that make decisions without human intervention.
Anja Kaspersen addressed the terminology problem in AI governance, noting that technical and policy communities often use the same terms with different meanings. She emphasized that trust is not a machine property but rather how societies navigate uncertainty, and warned against reducing complex issues to broad slogans while advocating for verification as an adaptive, ongoing process.
Chris Painter discussed the development of “frontier safety policies” by AI companies, which establish specific evaluations and thresholds for dangerous capabilities. However, he noted these remain voluntary and unstandardized across the industry.
Roman Yampolskiy presented a more pessimistic view, describing the state of AI alignment and control science as “mostly non-existent.” He argued that while AI capabilities advance exponentially, safety progress remains linear at best, and that testing paradigms break down when dealing with systems potentially smarter than humans.
The discussion revealed significant gaps in current approaches, with experts calling for better benchmarking, inclusive global participation, clearer terminology, and recognition that current testing methods may be insufficient for future AI capabilities.
Keypoints
## Major Discussion Points:
– **Global Inclusivity in AI Testing and Standards**: The need to move beyond current AI testing processes that are concentrated in resource-rich regions and ensure meaningful participation from developing countries, particularly Africa, in creating evaluation benchmarks that reflect diverse local realities and worldviews.
– **The Evolution from Model Testing to Product Testing**: The urgent need to shift focus from evaluating AI models in isolation to testing complete AI solutions, products, and features that end users will actually interact with, including the safety and security systems built around primary models.
– **Terminology and Communication Gaps**: The critical importance of establishing clear, shared definitions between policymakers and technologists, as technical terms often have different meanings in policy contexts, which hampers effective verification efforts and governance.
– **Dangerous Capability Evaluations and Frontier Safety**: The development of testing frameworks by AI companies to identify when models develop concerning capabilities (like assisting with weapons creation or cyber attacks) and the establishment of conditional “red lines” that would trigger additional safety measures.
– **Fundamental Limitations of Current AI Safety and Control**: The sobering reality that alignment with human values remains poorly defined and largely unsolved, with testing paradigms inadequate for systems that may become smarter than humans and capable of unforeseen actions.
## Overall Purpose:
The discussion aimed to examine the current state of AI verification and testing capabilities, identify gaps in the global ecosystem for AI governance, and explore solutions for building more trustworthy AI systems through international collaboration, standardization, and inclusive testing practices.
## Overall Tone:
The discussion maintained a serious, academic tone throughout, with participants demonstrating both expertise and concern about the challenges ahead. While collaborative and constructive, there was an underlying urgency about the pace of AI development outstripping governance capabilities. The tone became notably more sobering toward the end, particularly with Professor Yampolskiy’s stark assessment that AI safety and control capabilities are “mostly non-existent,” shifting from cautious optimism about solutions to frank acknowledgment of fundamental limitations.
Speakers
– **Robert Trager** – Panel moderator/host
– **Rachel Adams** – Founder and CEO of the Global Center on AI Governance, an Africa-based organization advancing equitable AI governance worldwide
– **Boulbaba Ben Amor** – Director of AI for Good at Inception, a G42 company based in Abu Dhabi, creating industry-specific AI products built to solve real-world challenges
– **Anja Kaspersen** – Director for Global Markets Development, New Frontiers, and Emerging Spaces at IEEE, a global charity and the world’s largest technical professional organization
– **Chris Painter** – Director of Policy at Meter, a non-profit organization studying AI capabilities and the risks they may pose
– **Roman V. Yampolskiy** – Professor at the University of Louisville, whose work on AI safety and cybersecurity has highlighted existential risk posed by advanced AI
Additional speakers:
No additional speakers were identified beyond those in the speaker list above.
Full session report
# Can We Test for Trust? The Verification Challenge in AI – Panel Discussion Report
## Executive Summary
This panel discussion, moderated by Robert Trager, brought together five experts to examine the challenges of testing and verifying AI systems for trustworthiness. The conversation emerged from intensive multi-stakeholder discussions involving representatives from all continents, focusing on identifying gaps in current AI testing ecosystems. The panelists shared diverse perspectives on global inclusivity in AI testing, the evolution from model to product testing, terminology challenges, frontier safety policies, and fundamental limitations in current AI safety approaches.
## Key Participants and Their Perspectives
**Rachel Adams**, Founder and CEO of the Global Center on AI Governance, emphasized the need for inclusive, globally representative approaches to AI testing, particularly highlighting African perspectives often excluded from standard-setting processes.
**Boulbaba Ben Amor**, Director of AI for Good at Inception, stressed the importance of shifting focus from model testing to product testing and developing industry-specific benchmarks.
**Anja Kaspersen**, Director for Global Markets Development at IEEE, addressed the role of professional organizations in AI governance and the need for verification as an ongoing process.
**Chris Painter**, Director of Policy at Meter, discussed emerging frontier safety policies and dangerous capability evaluations being developed by AI companies.
**Roman V. Yampolskiy**, Professor at the University of Louisville, presented a critical assessment of the fundamental limitations in current AI safety and control science.
## Major Discussion Points
### Global Inclusivity and Representation in AI Testing
Rachel Adams opened by highlighting the systematic exclusion of developing countries, particularly African nations, from AI standard-setting processes. She noted that “very often it’s the people that are designing and developing the models that are designing and developing the benchmarks against which the performance of these models are being tested,” creating inherent bias in testing approaches.
Adams emphasized that current testing paradigms fail to account for how AI systems perform across diverse global contexts, arguing that “how these technologies fail in different parts of the world is a really attenuated concern and it’s not a concern that can be answered by one group of people with one distinct kind of worldview.” She advocated for multistakeholder, multidisciplinary approaches that incorporate diverse perspectives and local realities.
### The Shift from Model Testing to Product Testing
Boulbaba Ben Amor introduced a critical distinction between evaluating AI models in isolation versus testing complete AI solutions that end users actually interact with. “While the major effort is turned to evaluate and benchmark AI models,” he argued, “we need to move as quick as possible to evaluate AI solutions, AI products, and AI features.”
Ben Amor stressed that different industries require distinct benchmarking approaches, with healthcare, education, defense, finance, and other sectors each needing specific testing methodologies. He also raised concerns about emerging “agentic” AI solutions that operate with greater autonomy, making decisions without human intervention.
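To make the model-versus-product distinction concrete, here is a minimal sketch, in Python, of how an evaluation harness might run the same benchmark prompts against a bare model and against the full product pipeline (the model plus the safety and security layers wrapped around it). All names, the toy model, and the filter logic are invented for illustration; they do not represent G42's or any vendor's actual systems.

```python
# Hypothetical sketch of the model-vs-product distinction Ben Amor draws.
# The "model", the safety filter, and the benchmark prompts are all invented.

from typing import Callable, List

def raw_model(prompt: str) -> str:
    """Stand-in for a primary model: returns a canned answer."""
    return f"MODEL ANSWER to: {prompt}"

def safety_filter(prompt: str, answer: str) -> str:
    """Stand-in for the guardrails wrapped around the model in a shipped product."""
    blocked_terms = ["weapon", "exploit"]
    if any(term in prompt.lower() for term in blocked_terms):
        return "Request refused by product-level safety layer."
    return answer

def product_pipeline(prompt: str) -> str:
    """The full product: primary model plus the safety/security layers around it."""
    return safety_filter(prompt, raw_model(prompt))

def run_eval(system: Callable[[str], str], prompts: List[str]) -> List[str]:
    """Run the same benchmark prompts against either the bare model or the product."""
    return [system(p) for p in prompts]

if __name__ == "__main__":
    benchmark = ["Summarise this clinical note", "Explain how to build a weapon"]
    print(run_eval(raw_model, benchmark))         # what most model benchmarks measure
    print(run_eval(product_pipeline, benchmark))  # what end users actually receive
```

The point of the sketch is simply that the two runs can diverge: a benchmark score for the bare model says little about what end users of the shipped product will actually experience.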
### Professional Organizations and Verification Processes
Anja Kaspersen discussed the role of technical professional organizations like IEEE in AI governance conversations. She emphasized verification as an adaptive, ongoing process rather than a one-time assessment, noting the importance of bringing technical expertise into policy discussions.
Kaspersen made a fundamental distinction about trust: “Trust is not a property of machines. It is how institutions and societies navigate uncertainty.” She also highlighted the need for conceptual clarity, noting that “robustness under stress is not fairness. Privacy is not traceability. Resilience is not transparency.”
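As a hedged illustration of what translating a value into a narrow testable property can look like, the sketch below reduces fairness to one explicit, contestable check: a demographic parity gap compared against a stated tolerance. The metric choice, the 0.1 tolerance, and the sample data are assumptions made for illustration only; they are not an IEEE-specified standard, and real fairness assessments typically combine several such properties.

```python
# Minimal sketch: fairness narrowed to one explicit, testable property
# (demographic parity gap) with a declared tolerance. Threshold and data
# are illustrative assumptions, not a standardized figure.

from collections import defaultdict
from typing import Dict, List, Tuple

def demographic_parity_gap(decisions: List[Tuple[str, bool]]) -> float:
    """Largest difference in positive-decision rates between any two groups."""
    counts: Dict[str, List[int]] = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for group, approved in decisions:
        counts[group][0] += int(approved)
        counts[group][1] += 1
    rates = [pos / total for pos, total in counts.values()]
    return max(rates) - min(rates)

def passes_fairness_check(decisions: List[Tuple[str, bool]], tolerance: float = 0.1) -> bool:
    """Contestable claim: the gap is measured and compared against a stated tolerance."""
    return demographic_parity_gap(decisions) <= tolerance

sample = [("group_a", True), ("group_a", False), ("group_b", True), ("group_b", True)]
print(demographic_parity_gap(sample), passes_fairness_check(sample))
```

Framing the check this way is what makes the commitment contestable: anyone can dispute the metric, the tolerance, or the data, because all three are stated explicitly.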
### Frontier Safety Policies and Dangerous Capability Evaluations
Chris Painter introduced the concept of “frontier safety policies” being developed by AI companies to identify when models develop concerning capabilities. These policies establish specific evaluations and thresholds for dangerous capabilities, such as assisting with weapons creation or cyber attacks.
Painter explained that companies have created testing frameworks with different names but similar core concepts for identifying dangerous capabilities. However, he noted that these approaches remain voluntary and lack standardization across the industry. Current testing, he observed, focuses on detecting when dangerous capabilities arrive rather than proving systems are safe or reliable.
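A minimal sketch of the conditional red-line structure described above, assuming a simple mapping from dangerous-capability evaluations to thresholds and required mitigations. Evaluation names, threshold values, and mitigations are all hypothetical; actual frontier safety policies vary in scope and detail from company to company and are not standardized.

```python
# Hypothetical sketch of a "conditional red line": each entry pairs a
# dangerous-capability evaluation with a threshold and the mitigations that
# would be required before development or deployment continues. All names,
# scores, and mitigations are invented for illustration.

from dataclasses import dataclass
from typing import Dict, List

@dataclass
class RedLine:
    evaluation: str               # name of the dangerous-capability evaluation
    threshold: float              # score at or above which the red line is crossed
    required_mitigations: List[str]

POLICY = [
    RedLine("bio_uplift_for_novices", 0.5,
            ["stricter model-level refusals", "upgraded information security"]),
    RedLine("autonomous_cyber_offense", 0.3,
            ["external red-teaming", "pause further scaling pending review"]),
]

def crossed_red_lines(scores: Dict[str, float], policy: List[RedLine]) -> List[RedLine]:
    """Return every red line whose evaluation score meets or exceeds its threshold."""
    return [line for line in policy
            if scores.get(line.evaluation, 0.0) >= line.threshold]

latest_scores = {"bio_uplift_for_novices": 0.62, "autonomous_cyber_offense": 0.1}
for line in crossed_red_lines(latest_scores, POLICY):
    print(f"{line.evaluation}: required before proceeding -> {line.required_mitigations}")
```

Encoding the policy this way is what turns the abstract question "is progress too fast?" into the concrete question "which red lines, if any, did the latest evaluation scores cross?"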
### Fundamental Limitations of Current AI Safety Science
Roman Yampolskiy presented a sobering assessment, describing the state of AI alignment and control science as “mostly non-existent.” He highlighted a critical asymmetry: “while progress in AI is really exponential or hyper-exponential in terms of capabilities, progress in safety, progress in our ability to control those systems is linear at best, if not constant.”
Yampolskiy argued that traditional testing paradigms break down when dealing with systems that may exceed human intelligence: “If a system is smarter than you, more complex, more creative, it’s capable of doing something you didn’t anticipate. So we don’t know how to test for bugs we haven’t seen before.”
## Rapid-Fire Round: What Policymakers Are Missing
In the final round, each panelist identified a perspective they believe policymakers are currently overlooking:
**Rachel Adams** pointed to public perception and social attitude surveys, largely absent from the African continent, as a missing indicator of trust, distinguishing whether a technology is trustworthy from whether people actually trust it, and noted a newly completed survey in South Africa.
**Boulbaba Ben Amor** called for shifting evaluation from models to complete AI solutions, products, and features, including the safety and security layers built around primary models, with greater policymaker attention to product conformity and adapted risk assessments.
**Anja Kaspersen** stressed that policymakers and technologists must align on terminology, treat verification as an open-ended and continually adaptive process, and bring independent technical communities into the conversation before deployment and before governance principles are set.
**Chris Painter** cautioned that much current testing does not prove systems are safe or reliable; it gathers evidence to detect the moment when dangerous capabilities arrive.
**Roman Yampolskiy** warned that legal and governance measures presuppose a technical ability to monitor AI capabilities that does not yet exist: live training runs cannot be monitored, full testing takes months, and new capabilities continue to surface after deployment.
## Key Challenges Identified
The discussion revealed several critical challenges in AI testing and verification:
1. **Systematic exclusion** of developing countries from standard-setting processes
2. **Mismatch** between testing AI models versus testing real-world AI products
3. **Lack of standardization** in frontier safety policies across companies
4. **Terminology confusion** between technical and policy communities
5. **Fundamental limitations** in current AI safety and control science
6. **Asymmetry** between rapid capability advancement and slower safety progress
## Conclusion
The panel discussion highlighted both the urgency and complexity of AI verification challenges. While speakers approached the problem from different angles—global inclusivity, product-focused testing, professional standards, frontier safety, and fundamental limitations—they collectively demonstrated that current approaches to AI testing and verification require significant improvement.
The conversation revealed that testing for trust in AI systems involves not just technical challenges but complex socio-technical problems requiring collaboration across disciplines, sectors, and regions. The path forward requires addressing systematic exclusions in standard-setting, shifting focus from models to products, standardizing safety policies, clarifying terminology, and potentially developing entirely new frameworks for AI safety and control.
Session transcript
Robert Trager: All right, so welcome, welcome back. Still here. Yes, this panel is entitled, Can we test for trust? The verification challenge in AI. And this is actually something that we have been talking about, in fact, in some intensive discussions here over the last few days, and very much looking to make progress on. So I can report just a little bit on those discussions, and then we will turn to this distinguished panel. So just over the last few days, and all of these folks here have been participating in these discussions, we have convened a globally representative group representing at least all continents, except for Antarctica, a multi-stakeholder group with the recognition that governments are working on these problems, academia, industry, civil society, et cetera, are working hard, and no one group here has the solutions. And of course, technical experts, with the goal of identifying gaps in the current ecosystem and exploring solutions to address those gaps and needs. So we discussed things like capacity building for testing of AI systems broadly in the world, how can different parts of the world collaborate and interact with each other in order to effectively test systems and ensure their reliability. We discussed best practices, standards, institutional frameworks for collaboration, and we talked about potential next steps, including extending the dialogue, establishing an ongoing discussion group, identifying priorities and gaps, and maybe having groups that can begin the process of moving from research, because there's so much research happening in the space, to pre-standardization, and then hopefully from pre-standardization to standardization. And that's a pipeline that I think is a difficult one, it's a challenging one, given the fast evolution of this space, but it's one that I think, happily, many people around the world are committed to working on. And so are all of these distinguished folks here, so I'll briefly introduce them. We have Anja Kaspersen, Director for Global Markets Development, New Frontiers, and Emerging Spaces at IEEE, a global charity and the world's largest technical professional organization. Welcome. Boulbaba Ben Amor, Director of AI for Good at Inception, a G42 company based in Abu Dhabi, creating industry-specific AI products built to solve real-world challenges. Roman V. Yampolskiy, Professor at the University of Louisville, whose work on AI safety and cybersecurity has highlighted existential risk posed by advanced AI. Chris Painter, Director of Policy at Meter, a non-profit organization studying AI capabilities and the risks they may pose. And finally, Dr. Rachel Adams, Founder and CEO of the Global Center on AI Governance, an Africa-based organization advancing equitable AI governance worldwide. So, Rachel, I think maybe I will ask you first to summarize some of the key discussion questions that we've been addressing in the workshop and some of the takeaways as you see them.
Rachel Adams: Yeah, thank you so much. Let me just make sure this is on. Good morning, everyone. You know, I think it’s important to reflect on the fact that historically standard-setting processes have not adequately included representation from developing countries around the world and often resulted in the establishment of barriers to entry for smaller market players and for these economies. So there is a really important opportunity that we have now when we’re thinking about this pipeline and how we do it intentionally and inclusively to do it and do it differently. At the moment, the kind of pre-deployment testing processes that are going on are centered in very, very distinct parts of the world that have the resources, the skills, and the expertise to be able to do this testing and to do this testing on behalf of the world. And I think one of the things that came out of the discussion yesterday is very often it’s the people that are designing and developing the models that are designing and developing the benchmarks against which the performance of these models are being tested. So certainly from a kind of African perspective, we’re thinking a lot about how a multistakeholder, multidisciplinary approach to testing can help us to develop much more trustworthy technology because it’s able to pull expertise from very different perspectives and worldviews, to be able to co-create evaluation assets that really reflect very distinct local realities, conditions, and risks. How these technologies fail in different parts of the world is a really attenuated concern and it’s not a concern that can be answered by one group of people with one distinct kind of worldview. So these are some of the things we’re thinking about.
Robert Trager: Great. Thank you so much. Maybe I’ll turn here to Mr. Ben Amour. So how can reliability and standards and benchmarks speed AI adoption and what are the next steps in standards and benchmarks development internationally?
Boulbaba Ben Amor: Good morning, everyone. Thanks for having me here. So I think that in terms of standards, the standard-setting bodies are doing a great job in formulating the standards and publishing different reports on the standards. They actually need to make more effort on harmonization, for less fragmentation, so we can have convergence in terms of standardization. And they need to keep standardization flexible and agile, so it can follow new technology advancements in AI. In terms of benchmarking, that's really what we need, and we need to double down and multiply the efforts in terms of benchmarking to test the robustness of AI, to test the safety of AI, and the fairness of AI. There is also a reason behind that: as we get closer and closer to the final customers and to society, we need to adapt AI solutions more and more to the fields. So serving the healthcare field will be different from serving education, defense, the oil and gas industry, or the finance industry. Each of these fields will need different benchmarks. So we need to work more and more, double down on the funding of research and development projects to test the safety of these AI solutions. Again, a few months ago we were talking about large language models, then frontier AI models, which come with more capabilities, reasoning capabilities, writing code, reviewing code, even proposing some scientific discoveries like generating proteins or predicting the structure of proteins. That also needs diverse and more specific benchmarking. So I want here to call for action: benchmarking, benchmarking, benchmarking. Of course, governments will be customers of these AI technologies. I want to take the example of education and healthcare. The regulatory bodies in the countries and the continents will be showing the way we apply AI. So ministries of education and healthcare need to be part of the high-level design of AI systems, to guarantee that these systems serve the actors in these different industries in the best way. In education, for example, we have different actors. We have the students, we have the researchers, we have the administrative staff. We need to serve all of them in different ways, and we need to guarantee that the solutions we are providing are safe. The other point that I would like to highlight is that today we are talking more and more about agentic AI solutions rather than models, right? So we also need to adapt the trustworthiness methodology to test these agentic AI solutions, which will have more autonomy to take decisions without the intervention of humans. So all this needs a particular effort of standardization and benchmarking, benchmarking, benchmarking. Thank you.
Robert Trager: Thank you very much. All right. I’ll turn now to Ms. Caspersen. So you’re representing the largest technical professional organization. How do you see professional organizations having an impact on AI governance?
Anja Kaspersen: Massively so. So let me, I'm just going to rewind a little bit to the title of this session, if you allow me, and share some observations. First of all, thank you to ITU for having me and for having us. Naturally I'm not speaking on behalf of the technical community, which is vast, and much of which is represented here as well, but I will try to share a couple of observations that I think might help frame our thinking around issues of verification. And one of the challenges that we've seen is of course to verify AI systems so that reliance becomes justified and contestable, grounded in external evidence. And it's important to come back to this notion of trust, which is not a property of machines. It is how institutions and societies navigate uncertainty. And too often these discussions reduce complex issues to broad slogans framed as trust, responsibility, or similar terms, which obscure fundamentally different concerns. Robustness under stress is not fairness. Privacy is not traceability. Resilience is not transparency. Each of these terms and each of these processes demands distinct technical methods, oversight structures, and lines of accountability. That is why professional communities, including the IEEE, the largest global technical organization in the world, look at really integrating ethical considerations in all of our processes and standards-making processes. Sorry. And we are deliberately separating some of the concerns to make sure that we look at narrow testable properties. This is not a way of pushing away ethics. It's actually how ethics become operational. By translating values like fairness or accountability into explicit verifiable properties, these frameworks make ethical commitments contestable as well as enforceable. We heard over the last few days a lot of use of different terminology, and nomenclature really matters when you're trying to govern this space. And it is very critical, and many have said this before me on this stage, to avoid overfitting loose concepts like trust, trustworthiness, responsible, or even agency onto systems where these remain underspecified. There is a risk that such abstractions too easily drift into marketing or policy language, bypassing the interpretive or, as I like to say, the hermeneutic work needed to ground verification in specific examinable terms. You know, we heard in the previous session, and over the course of several conversations here, this use of safety. And it's important to remind ourselves that safety itself is not a singular concept. Technical communities may emphasize reliability and fault tolerance, while policy communities often prioritize societal impacts or human rights thresholds. Thus, verification cannot treat safety as uniform. It must tie these varied interpretations to explicit, contestable standards. And from my own experience working in the verification space, but more around arms control and disarmament, it is clear that formal compliance is never enough. Systems can pass technical checks, yet fail under real-world conditions. Thus, verification cannot stop at nominal tests. And many failures only appear when systems operate under compounded, realistic stresses. So these points are very important as we look at verification. It's not one thing. There is a lot of experience with verification across different domains, which we can learn from in the AI space.
And especially when it comes to the socio-technical dynamics that shape how societies absorb disruption. And as my colleague here was saying, this is a particularly important point when we talk about generative models that do not just map inputs to outputs, but frequently adapt, shift, and influence exactly how decisions are framed or even made to begin with. So I'll leave it there, and then we'll come back to more, I'm sure, with your questioning. Thank you.
Robert Trager: Thank you so much. I want to turn now to Chris Painter, Director of Policy at Meter. So you've been working hard to develop best practices in the space of advanced AI, with the ultimate goal of getting companies to adopt those best practices. What have you learned, and where are we in that journey?
Chris Painter: Yeah, I get nervous hearing the term best practices, because it feels like kind of the first-pass practices or something. But I can speak to that. You know, the train of AI feels like it's moving down the tracks very quickly, and so many people are trying to figure out very quickly what is worth testing for, and what the first set of risk thresholds are that could arise from AI model development and would need to be responded to. So I can speak a little bit to that. Many companies, both in the US and internationally, have created these testing and evaluation frameworks. They go by many different names at different companies. You have the OpenAI Preparedness Framework, the Anthropic Responsible Scaling Policy. Other people have used different terms. The kind of catch-all term that we use for this is frontier safety policies, but you can call them what you want to. And the core idea in these policies is asking AI developers to state specific evaluations that, if passed, would make them think they have a model on their hands today that they don't know how to safely and securely handle. And this is kind of downstream of the field of AI dangerous capability evaluations, where we're talking about AI evaluations that measure something you, in some abstract sense, hope to not see AI systems get better at, right? So this is things like, can this AI system assist with the creation of some kind of weapon? Can it automate some kind of cyber attack? Can it assist in the synthesis of a biological weapon? Or can it do some dangerous string of tasks autonomously? And these are also threats, I should say, that are kind of asymmetric, large-scale risks to public safety and security. Things that wouldn't show up in the normal process of product reliability. I think many other categories of things are important to test for, so I'm only speaking to that narrow slice of dangerous capability evaluations that are occurring at AI labs. But, like I mentioned, the idea with this is you state specific evaluations that, if passed, would make you think you have a model on your hands that you are unprepared to safely handle. So an example of this might be: if we see that an AI system is able to assist technical novices, so undergrad students in biology, for instance, in synthesizing a biological or chemical weapon that they previously, without access to AI, would have been unable to synthesize, then perhaps we need additional, stricter information security at our lab. We need some reliable process that can be red-teamed for making sure that we're denying that capability via an API or via the front-door kind of product interface to the model that we've developed. And what happens when you set out this kind of series of conditional red lines is that, if you identify a series of threat models you want to monitor for, it lets you back out a roadmap for where safety and security will need to be at the level of model mitigations, so what is the model refusing to assist with, or what kinds of tasks is it refusing, and how strict is your lab's information security. You sort of back out of these policies a roadmap for where companies will need to be in the future on those practices to ensure that the safety and security mitigations are keeping up with model capability.
And the nice property that this has is it lets you move past abstract debate around how fast AI dangerous capabilities are progressing, and it lets you instead talk concretely, for every individual threat model, about how you will operationalize when the model is capable enough that a new risk outcome has become more likely, right? So rather than talking in the abstract about whether AI progress is going too quickly or too slowly on any of these dual-use capabilities, you can say what specific thing, if we observed it, would make us think that some kind of response was warranted. And this has become a sort of, you know, pseudo-standard in some sense, where many of the companies have published these policies, so you can directly compare both the detail that goes into them and the adherence of the companies to these policies. Are they publishing evaluations that describe how they've ruled out each of these capabilities for their models? And there's one other thing I was going to say about it. One thing that also is nice about this is it promotes clarity inside of AI developers: it makes it clear that progress on things like information security and model-level mitigations or monitoring model capabilities internally could become an actual roadblock to further progress, right? The safety and security progress becomes a rate-limiting step in further model development, which I think is a good thing for companies to be clear about. I think that there are many kinds of issues with the paradigm as it's currently set up. All of this is happening in a voluntary context. There isn't standardization among even the practices that the companies are employing. These are all unilaterally described, and they're more like illustrative roadmaps of what the companies believe the threat models will be and where they think security and safety will need to be, rather than a formal, coordinated commitment throughout industry.
Robert Trager: Thanks Chris. Great. So we have a few minutes left. I want to turn to Professor Yampolskiy. You’re known for working in the areas of alignment and control. So alignment of AI systems with human values, control of those systems. How would you characterize the state of science there?
Roman V. Yampolskiy: Mostly non-existent. Alignment is not well defined. We don't agree on values. Obviously, different societies, different cultures, religions all have their own sets of values. Philosophers have failed, after two millennia of research, to come to common ethical and moral conclusions. And if we somehow agreed on it, it's dynamically changing, right? What was ethical 100 years ago is definitely not ethical today. It will continue evolving. And even if we somehow settled on a static model of those values, we still don't know how to program those values into systems. On top of it, the systems are not static themselves. They are changing, they are learning, they are self-improving, they interact with users, they learn from new data. We anticipate that at some point we'll get to human level and go beyond, meaning those systems will be smarter than us. So the paradigms of testing we have today, where we predict what to test for, synthetic bio threats and things like that, only work when you can list all the possible dangers. If a system is smarter than you, more complex, more creative, it's capable of doing something you didn't anticipate. So we don't know how to test for bugs we haven't seen before. The general testing process allows me to say, oh, I found two problems, I found two bugs, I reported them, I fixed them, but it doesn't say that I proved the system is bug-free. It could still have many problems remaining. And even if we did well during the initial testing phase, it doesn't mean that new capabilities are not added after deployment. So there is really very little we know how to do in this space. There are strict limits on understanding how the system works, explaining its behavior, predicting its behavior, and verifying code. There is verification with respect to a specific verifier, an AI or human mathematician model, but that's just an infinite regress of verifiers. You can still have bugs; it's just that the verifier itself was not perfect. So while progress in AI is really exponential or hyper-exponential in terms of capabilities, progress in safety, progress in our ability to control those systems is linear at best, if not constant.
Robert Trager: Thank you. Yes. Well, a half an hour goes quickly when we all have a lot of really interesting things to say. So I think what we will do is turn to the rapid-fire round now. So I’m going to ask each of you, based on your area of expertise, what is a lens or a point of view that is currently not being considered by policymakers when discussing approaches to testing and mitigating risks from AI? And maybe we’ll just go down this way, please. Thank you.
Boulbaba Ben Amor: Thank you. So while the major effort is turned to evaluate and benchmark AI models, we need to move as quick as possible to evaluate AI solutions, AI products, and AI features. Because companies are doing a great job in building their solutions, and their solutions are based on primary models, but around these primary models there are also safety solutions, security solutions, and so on and so forth. So we need the policymakers to focus more on product conformity and on the adaptation of risk assessments for AI solutions. While it is important to keep working on the evaluation of AI models, we need to move as soon as possible to evaluating the AI solutions and the AI products. This is what the end users and society will get to use at the end of the day. And again, this is what we are trying to do at G42 as well. We are establishing a responsible AI foundation, where we'll have an AI safety institute doing all the external and independent evaluations of different models coming from third parties. We will have a research lab doing AI for good research on different projects for biodiversity, sustainability, and so on, and also serving underrepresented populations. So we also need to diversify and to include, in the different policies, different cultures, different languages, and different fields, including scientific discovery, which will change a lot with AI. Thank you.
Anja Kaspersen: I think, you know, following up on the issue of language, one thing that we often overlook is the terminology piece. So I think there's a great need for policymakers and technologists to sit down together and try to actually go through the classic terminology. I'm not talking about taxonomies now, but the actual comprehension of terms, because there are a lot of terms in the technical space that mean something very different in the policy space, and vice versa. And that gets in the way of really robust verification efforts. It's also that verification must be understood as an open-ended, continually adaptive process, just as these technologies are fast adapting. I just want to take us back to a conversation that happened on this stage last year. It was actually the head of OpenAI in a conversation with Nicholas Thompson, and towards the end of the conversation he said something really interesting, where he was predicting that what happens in the governance space, and I'm quoting him, would asymptote off. It is a mathematical term. So he was implying that as the technology evolves quickly, the governance piece would not be able to keep up. And I think it's important to make sure that that does not happen, that we do not end up with two parallel routes where governance evolves at its own speed and technology evolves at a different one. Then just one very quick point, because I think the Greek minister who was here earlier made a really important point around practical wisdom, and I'm probably going to pronounce this wrongly, but in Greek there's a word, phronesis, which is exactly this practical wisdom. We need to hone that to avoid hubris, because there's a lot of hubris going on, and hubris is a way of masking the fact that these are dynamics of power and the distribution of power. The best way to fight back against hubris is to use practical wisdom and really pull on that collective insight. That goes back to your first question, Robert: does the technical community play a big role? Absolutely. Most of the technical communities, like my own, are independent and neutral in this space, but too often not pulled into the conversations when it really matters, before things are being deployed and before governance principles are being set. Thank you.
Robert Trager: Thank you. Yes.
Rachel Adams: Yeah, thank you so much, and I also wanted to pick up on a point you made, but also a point that the minister from Greece made in the previous panel. When we’re thinking about how we test for trust, there’s a really significant difference between, is a technology trustworthy? Does it work in the way we expect it to work? And do people trust in AI technologies? And one of the ways in which we can test for that is public perception and social attitude surveys, which are entirely missing from the African continent. We have, so please watch this space, just done the first public perception survey on AI in South Africa, and the findings are extremely, extremely interesting. So I think when we’re thinking about how do we put this in a broader social context, this is a really important indicator to look at.
Robert Trager: Thank you. Yes, professor.
Roman V. Yampolskiy: Any legal solution, any governance solution requires the technical ability to monitor the capabilities of that system. We cannot monitor live training runs of large, complex AI systems, and even after training completes, it takes six months, a year, to fully test it and understand what capabilities it has, and after deployment we still discover new capabilities in those systems. So the legal measures cannot meaningfully be implemented.
Robert Trager: Thank you. Chris, final word.
Chris Painter: Yeah, I think the thing I would say is something like: if you see that lots of testing is going on today for AI systems, you might think that there is some safe, idealized AI system that developers know how to build and are working back from, and I want to emphasize that that's not what's happening. Some of the testing that's happening is people checking against dangerous capabilities that they expect the models will become proficient at very soon, and so it's more about gathering evidence and trying to detect the moment when these dangerous capabilities arrive in AI systems, and I think that's really important for governments and the public to understand, right? Some of the tests that are happening aren't able to actually check reliability or whether the system is safe. They're waiting to detect the moment when the evidence arises that these, you know, somewhat alarming capabilities are coming online.
Robert Trager: Great, thank you to all of them, to all of you. We’ll have to leave it there. We had a number of gaps, quite a few, I think very significant gaps identified, as well as some solutions, and in the spirit of the three secrets of French cooking, or butter, butter, and butter, perhaps the three secrets to AI governance are benchmarking, benchmarking, and benchmarking, but please join me in thanking this rich set of panelists. Thank you so much. Many thanks indeed, some great pointers here.
Rachel Adams
Speech speed
146 words per minute
Speech length
435 words
Speech time
177 seconds
Historical standard-setting processes have excluded developing countries and created barriers for smaller market players
Explanation
Adams argues that traditional standard-setting processes have not adequately included representation from developing countries and have often resulted in barriers to entry for smaller market players and economies. She emphasizes the importance of doing things differently now with intentional and inclusive approaches.
Evidence
She mentions this is particularly important from an African perspective and notes the opportunity to establish processes that are more inclusive than historical ones.
Major discussion point
Inclusive and Global Representation in AI Testing and Standards
Topics
Development | Legal and regulatory
Agreed with
– Robert Trager
– Boulbaba Ben Amor
Agreed on
Need for inclusive and globally representative approaches to AI testing and standards
Pre-deployment testing is currently centered in resource-rich regions that test on behalf of the world
Explanation
Adams points out that current pre-deployment testing processes are concentrated in very distinct parts of the world that have the resources, skills, and expertise to conduct testing. These regions are essentially testing on behalf of the entire world.
Evidence
She notes that often the same people designing and developing models are also designing the benchmarks against which these models are tested.
Major discussion point
Inclusive and Global Representation in AI Testing and Standards
Topics
Development | Legal and regulatory
Multistakeholder, multidisciplinary approaches can develop more trustworthy technology by incorporating diverse perspectives and worldviews
Explanation
Adams advocates for multistakeholder, multidisciplinary approaches to testing that can help develop more trustworthy technology. She argues this approach can pull expertise from different perspectives and worldviews to co-create evaluation assets that reflect local realities, conditions, and risks.
Evidence
She emphasizes that how technologies fail in different parts of the world is a concern that cannot be answered by one group of people with one distinct worldview.
Major discussion point
Inclusive and Global Representation in AI Testing and Standards
Topics
Development | Sociocultural
Agreed with
– Robert Trager
– Boulbaba Ben Amor
Agreed on
Need for inclusive and globally representative approaches to AI testing and standards
Public perception and social attitude surveys are missing indicators for testing trust, particularly in regions like Africa
Explanation
Adams distinguishes between whether a technology is trustworthy versus whether people trust AI technologies. She argues that public perception and social attitude surveys are entirely missing from the African continent and are important indicators for understanding trust in a broader social context.
Evidence
She mentions conducting the first public perception survey on AI in South Africa with extremely interesting findings.
Major discussion point
Benchmarking and Testing Methodologies
Topics
Development | Sociocultural
Boulbaba Ben Amor
Speech speed
133 words per minute
Speech length
743 words
Speech time
333 seconds
Standards bodies need harmonization to reduce fragmentation and maintain flexibility to follow AI technology advancement
Explanation
Ben Amor argues that while standards bodies are doing good work in formulating and publishing standards, they need to make more efforts toward harmonization to reduce fragmentation. He emphasizes keeping standardization flexible and agile to follow new technology advancement in AI.
Major discussion point
Inclusive and Global Representation in AI Testing and Standards
Topics
Legal and regulatory | Infrastructure
Different fields require different benchmarks – healthcare, education, defense, finance each need specific testing approaches
Explanation
Ben Amor emphasizes that as AI solutions get closer to final customers and society, they need to be adapted to specific fields. Each field – healthcare, education, defense, oil and gas, finance – will need different benchmarks for testing robustness, safety, and fairness.
Evidence
He calls for doubling down on funding research and development projects to test AI solution safety, noting the evolution from large language models to frontier AI models with reasoning capabilities.
Major discussion point
Benchmarking and Testing Methodologies
Topics
Legal and regulatory | Economic
Need to move from testing AI models to testing AI solutions, products, and features that end users actually interact with
Explanation
Ben Amor argues that while major efforts focus on evaluating AI models, there’s a need to quickly move to evaluating AI solutions, products, and features. He emphasizes that companies build solutions based on primary models but include additional safety and security solutions around them.
Evidence
He mentions G42’s establishment of a responsible AI foundation with an AI safety institute for external evaluations and a research lab for AI for good projects.
Major discussion point
Benchmarking and Testing Methodologies
Topics
Legal and regulatory | Economic
Agreed with
– Chris Painter
– Roman V. Yampolskiy
Agreed on
Current testing approaches are insufficient and need fundamental improvements
Disagreed with
– Chris Painter
– Roman V. Yampolskiy
Disagreed on
Focus of testing efforts – models vs. products vs. capabilities
Anja Kaspersen
Speech speed
157 words per minute
Speech length
1013 words
Speech time
386 seconds
Trust is not a property of machines but how institutions and societies navigate uncertainty
Explanation
Kaspersen argues that trust should not be viewed as a machine property but rather as how institutions and societies manage uncertainty. She criticizes the reduction of complex issues to broad slogans like trust or responsibility that obscure fundamentally different concerns.
Evidence
She provides examples showing that robustness under stress is not fairness, privacy is not traceability, and resilience is not transparency – each requiring distinct technical methods and oversight structures.
Major discussion point
Terminology and Conceptual Clarity in AI Governance
Topics
Legal and regulatory | Sociocultural
Different concepts like robustness, fairness, privacy, and transparency require distinct technical methods and oversight structures
Explanation
Kaspersen emphasizes that various AI-related concepts demand separate technical approaches and accountability structures. She argues this separation is not about pushing away ethics but making ethical commitments operational and enforceable through explicit verifiable properties.
Evidence
She mentions IEEE’s approach of integrating ethical considerations in standards-making processes by deliberately separating concerns to look at narrow testable properties.
Major discussion point
Terminology and Conceptual Clarity in AI Governance
Topics
Legal and regulatory | Infrastructure
Agreed with
– Roman V. Yampolskiy
Agreed on
Terminology and conceptual clarity are critical challenges in AI governance
Policymakers and technologists need to align on terminology comprehension to enable robust verification efforts
Explanation
Kaspersen argues there’s a great need for policymakers and technologists to collaborate on understanding terminology, as many technical terms mean something different in policy spaces and vice versa. This misalignment interferes with robust verification efforts.
Major discussion point
Terminology and Conceptual Clarity in AI Governance
Topics
Legal and regulatory | Sociocultural
Agreed with
– Roman V. Yampolskiy
Agreed on
Terminology and conceptual clarity are critical challenges in AI governance
Verification must be understood as an open-ended, continually adaptive process
Explanation
Kaspersen emphasizes that verification should be viewed as an ongoing, adaptive process that evolves alongside fast-adapting technologies. She warns against governance and technology becoming parallel tracks evolving at different speeds.
Evidence
She references a conversation where the head of OpenAI predicted governance would ‘asymptote off’ – implying governance wouldn’t keep up with technology evolution.
Major discussion point
Terminology and Conceptual Clarity in AI Governance
Topics
Legal and regulatory
Technical communities like IEEE must be included in governance conversations before deployment and policy-setting
Explanation
Kaspersen argues that technical communities play a massive role in AI governance and should be included in conversations before deployment and governance principles are set. She emphasizes that most technical communities are independent and neutral but often not consulted when it matters most.
Evidence
She mentions IEEE as the largest global technical organization that integrates ethical considerations in standards-making processes.
Major discussion point
Inclusive and Global Representation in AI Testing and Standards
Topics
Legal and regulatory | Infrastructure
Chris Painter
Speech speed
176 words per minute
Speech length
1147 words
Speech time
389 seconds
Companies have created testing frameworks with different names but similar core concepts for identifying dangerous capabilities
Explanation
Painter explains that many companies have developed testing and evaluation frameworks with various names like OpenAI’s Preparedness Framework and Anthropic’s Responsible Scaling Policy. These frameworks, collectively called frontier safety policies, ask developers to identify evaluations that would indicate they have a model they don’t know how to safely handle.
Evidence
He provides examples of dangerous capability evaluations like assisting with weapon creation, automating cyber attacks, or helping synthesize biological weapons.
Major discussion point
Industry Standards and Regulatory Approaches
Topics
Cybersecurity | Legal and regulatory
Disagreed with
– Roman V. Yampolskiy
– Boulbaba Ben Amor
Disagreed on
Feasibility and current state of AI safety and control
Current industry practices are voluntary and lack standardization across companies
Explanation
Painter notes that all current testing practices happen in a voluntary context without standardization among companies. These are unilateral descriptions that serve more as illustrative roadmaps of what companies believe threat models will be rather than formal coordinated commitments.
Major discussion point
Industry Standards and Regulatory Approaches
Topics
Legal and regulatory | Economic
Frontier safety policies help companies identify dangerous capabilities and set conditional red lines for model development
Explanation
Painter describes how frontier safety policies create a framework for companies to set conditional red lines based on specific threat models. This approach allows moving past abstract debates about AI progress speed to concrete discussions about what specific observations would warrant responses.
Evidence
He gives an example of undergraduate biology students being able to synthesize biological weapons with AI assistance that they couldn’t create without AI.
Major discussion point
Benchmarking and Testing Methodologies
Topics
Cybersecurity | Legal and regulatory
Current testing detects when dangerous capabilities arrive rather than proving systems are safe or reliable
Explanation
Painter emphasizes that current testing is not about checking against some idealized safe AI system that developers know how to build. Instead, it’s about gathering evidence and detecting when dangerous capabilities emerge in AI systems.
Major discussion point
Benchmarking and Testing Methodologies
Topics
Cybersecurity | Legal and regulatory
Agreed with
– Roman V. Yampolskiy
– Boulbaba Ben Amor
Agreed on
Current testing approaches are insufficient and need fundamental improvements
Disagreed with
– Boulbaba Ben Amor
– Roman V. Yampolskiy
Disagreed on
Focus of testing efforts – models vs. products vs. capabilities
Roman V. Yampolskiy
Speech speed
154 words per minute
Speech length
430 words
Speech time
167 seconds
Alignment and control science is mostly non-existent with undefined values that vary across cultures and change over time
Explanation
Yampolskiy argues that AI alignment and control science is largely non-existent because values are not well-defined, vary across societies and cultures, and change dynamically over time. He notes that philosophers have failed after two millennia to reach common ethical conclusions.
Evidence
He points out that what was ethical 100 years ago is not ethical today and will continue evolving, and even if values were agreed upon, we don’t know how to program them into systems.
Major discussion point
Limitations of Current AI Safety and Control
Topics
Human rights | Sociocultural
Agreed with
– Anja Kaspersen
Agreed on
Terminology and conceptual clarity are critical challenges in AI governance
Disagreed with
– Chris Painter
– Boulbaba Ben Amor
Disagreed on
Feasibility and current state of AI safety and control
Cannot test for bugs or capabilities not yet anticipated, especially when systems become smarter than humans
Explanation
Yampolskiy argues that current testing paradigms only work when you can list all possible dangers, but systems smarter than humans will be capable of things we didn’t anticipate. General testing can find specific bugs but cannot prove a system is bug-free or prevent new capabilities from emerging after deployment.
Evidence
He explains that finding and fixing two bugs doesn’t prove the system is bug-free, and new capabilities can be added after deployment.
Major discussion point
Limitations of Current AI Safety and Control
Topics
Cybersecurity | Legal and regulatory
Agreed with
– Chris Painter
– Boulbaba Ben Amor
Agreed on
Current testing approaches are insufficient and need fundamental improvements
Disagreed with
– Boulbaba Ben Amor
– Chris Painter
Disagreed on
Focus of testing efforts – models vs. products vs. capabilities
Legal and governance solutions require technical monitoring capabilities that don’t currently exist for complex AI systems
Explanation
Yampolskiy argues that any legal or governance solution requires the technical ability to monitor system capabilities, but we cannot monitor live training runs of large, complex AI systems. Even after training, it takes months to fully test and understand capabilities, with new ones still being discovered after deployment.
Major discussion point
Limitations of Current AI Safety and Control
Topics
Legal and regulatory | Cybersecurity
Progress in AI capabilities is exponential while progress in safety and control is linear at best
Explanation
Yampolskiy contrasts the exponential or hyper-exponential progress in AI capabilities with the linear at best (if not constant) progress in safety and our ability to control these systems. This creates a growing gap between what AI can do and our ability to safely manage it.
Major discussion point
Limitations of Current AI Safety and Control
Topics
Legal and regulatory | Development
Disagreed with
– Chris Painter
– Boulbaba Ben Amor
Disagreed on
Feasibility and current state of AI safety and control
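The following is a stylized formulation offered purely for exposition, not drawn from the panel: it assumes capability grows exponentially with rate k while safety and control effort grows linearly with rate a, and shows why the gap then widens without bound.

```latex
% Stylized illustration only; the growth rates are assumptions, not measurements.
\[
  C(t) = C_0\, e^{k t} \;\; \text{(capability, exponential)}, \qquad
  S(t) = S_0 + a\, t \;\; \text{(safety and control, linear at best)}
\]
\[
  \mathrm{Gap}(t) = C(t) - S(t) = C_0\, e^{k t} - S_0 - a\, t
  \;\to\; \infty \quad \text{as } t \to \infty, \text{ for any } k > 0.
\]
```

Under these assumed growth rates, the gap widens no matter how large the linear coefficient a is, which is the structural point of Yampolskiy's contrast.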
Robert Trager
Speech speed
122 words per minute
Speech length
828 words
Speech time
405 seconds
Globally representative multi-stakeholder collaboration is essential for AI testing and verification
Explanation
Trager emphasizes that governments, academia, industry, and civil society must work together as no single group has all the solutions. He advocates for a globally representative approach that includes all continents and multiple stakeholders to address gaps in the current AI testing ecosystem.
Evidence
He mentions convening a globally representative group drawn from every continent except Antarctica, with multi-stakeholder participation from governments, academia, industry, and civil society.
Major discussion point
Inclusive and Global Representation in AI Testing and Standards
Topics
Legal and regulatory | Development
Agreed with
– Rachel Adams
– Boulbaba Ben Amor
Agreed on
Need for inclusive and globally representative approaches to AI testing and standards
There needs to be a pipeline from research to pre-standardization to standardization in AI testing
Explanation
Trager identifies the need for a structured progression from research activities to pre-standardization and then to formal standardization. He acknowledges this pipeline is challenging given the fast evolution of AI but notes that many people worldwide are committed to working on it.
Evidence
He mentions plans to establish ongoing discussion groups, identify priorities and gaps, and form groups that can carry work from research through pre-standardization to standardization.
Major discussion point
Industry Standards and Regulatory Approaches
Topics
Legal and regulatory | Infrastructure
Capacity building for AI testing needs global collaboration and interaction
Explanation
Trager argues that different parts of the world need to collaborate and interact with each other to effectively test AI systems and ensure their reliability. This involves building capacity for testing AI systems broadly across the world rather than concentrating expertise in limited regions.
Evidence
He cites workshop discussions on building capacity for AI system testing across the world and on how different regions can collaborate.
Major discussion point
Inclusive and Global Representation in AI Testing and Standards
Topics
Development | Infrastructure
Agreements
Agreement points
Need for inclusive and globally representative approaches to AI testing and standards
Speakers
– Rachel Adams
– Robert Trager
– Boulbaba Ben Amor
Arguments
Historical standard-setting processes have excluded developing countries and created barriers for smaller market players
Multistakeholder, multidisciplinary approaches can develop more trustworthy technology by incorporating diverse perspectives and worldviews
Globally representative multi-stakeholder collaboration is essential for AI testing and verification
Policies need to be diversified to include different cultures, different languages, and different fields
Summary
All three speakers emphasize the critical importance of moving beyond historically exclusive processes to create truly global, inclusive approaches that incorporate diverse perspectives, cultures, and stakeholders in AI testing and governance.
Topics
Development | Legal and regulatory | Sociocultural
Current testing approaches are insufficient and need fundamental improvements
Speakers
– Chris Painter
– Roman V. Yampolskiy
– Boulbaba Ben Amor
Arguments
Current testing detects when dangerous capabilities arrive rather than proving systems are safe or reliable
Cannot test for bugs or capabilities not yet anticipated, especially when systems become smarter than humans
Need to move from testing AI models to testing AI solutions, products, and features that end users actually interact with
Summary
These speakers agree that existing testing methodologies are fundamentally limited – they cannot prove safety, cannot anticipate unknown capabilities, and often test the wrong things (models rather than actual products users interact with).
Topics
Legal and regulatory | Cybersecurity
Terminology and conceptual clarity are critical challenges in AI governance
Speakers
– Anja Kaspersen
– Roman V. Yampolskiy
Arguments
Different concepts like robustness, fairness, privacy, and transparency require distinct technical methods and oversight structures
Policymakers and technologists need to align on terminology comprehension to enable robust verification efforts
Alignment and control science is mostly non-existent with undefined values that vary across cultures and change over time
Summary
Both speakers highlight that unclear, undefined, or misaligned terminology creates fundamental barriers to effective AI governance and verification efforts.
Topics
Legal and regulatory | Sociocultural
Similar viewpoints
Both speakers recognize that AI testing capabilities are currently concentrated in wealthy regions, creating an imbalance where a few areas test on behalf of the entire world, and both advocate for building distributed global capacity.
Speakers
– Rachel Adams
– Robert Trager
Arguments
Pre-deployment testing is currently centered in resource-rich regions that test on behalf of the world
Capacity building for AI testing needs global collaboration and interaction
Topics
Development | Infrastructure
Both speakers emphasize the gap between technical communities and policy-making, with Kaspersen advocating for technical community inclusion and Painter noting the lack of coordinated industry standards.
Speakers
– Anja Kaspersen
– Chris Painter
Arguments
Current industry practices are voluntary and lack standardization across companies
Technical communities like IEEE must be included in governance conversations before deployment and policy-setting
Topics
Legal and regulatory | Infrastructure
Both speakers acknowledge fundamental technical limitations in our ability to monitor, understand, and verify AI systems, though they approach this from different angles – Yampolskiy from a theoretical impossibility perspective and Painter from a practical industry perspective.
Speakers
– Roman V. Yampolskiy
– Chris Painter
Arguments
Legal and governance solutions require technical monitoring capabilities that don’t currently exist for complex AI systems
Current testing detects when dangerous capabilities arrive rather than proving systems are safe or reliable
Topics
Legal and regulatory | Cybersecurity
Unexpected consensus
Limitations of current AI safety approaches across different expertise areas
Speakers
– Roman V. Yampolskiy
– Chris Painter
– Anja Kaspersen
Arguments
Progress in AI capabilities is exponential while progress in safety and control is linear at best
Current testing detects when dangerous capabilities arrive rather than proving systems are safe or reliable
Verification must be understood as an open-ended, continually adaptive process
Explanation
Despite coming from different backgrounds (academic AI safety, industry policy, and technical standards), these speakers converge on the fundamental inadequacy of current approaches to AI safety and verification. This consensus is unexpected because it spans theoretical and practical perspectives.
Topics
Legal and regulatory | Cybersecurity
Need to move beyond abstract concepts to concrete, operational approaches
Speakers
– Anja Kaspersen
– Boulbaba Ben Amor
– Chris Painter
Arguments
Different concepts like robustness, fairness, privacy, and transparency require distinct technical methods and oversight structures
Need to move from testing AI models to testing AI solutions, products, and features that end users actually interact with
Frontier safety policies help companies identify dangerous capabilities and set conditional red lines for model development
Explanation
These speakers from different sectors (technical standards, industry, and policy) unexpectedly agree on the need to move from abstract discussions to concrete, operational approaches – whether in terms of separating distinct technical concepts, testing actual products rather than models, or setting specific conditional thresholds.
Topics
Legal and regulatory | Economic
Overall assessment
Summary
The speakers demonstrate significant consensus on several critical issues: the need for more inclusive global approaches to AI governance, the fundamental limitations of current testing and safety approaches, the importance of terminology clarity, and the necessity of moving from abstract concepts to concrete operational frameworks. There is also broad agreement that current approaches are insufficient for the challenges posed by rapidly advancing AI systems.
Consensus level
High level of consensus on problem identification with emerging agreement on solution directions. This suggests a mature understanding of the challenges across different stakeholder groups, which could facilitate coordinated action. However, the consensus primarily focuses on what’s wrong with current approaches rather than specific solutions, indicating that while there’s agreement on problems, the path forward still requires significant collaborative work to develop concrete, implementable solutions.
Differences
Different viewpoints
Feasibility and current state of AI safety and control
Speakers
– Roman V. Yampolskiy
– Chris Painter
– Boulbaba Ben Amor
Arguments
Alignment and control science is mostly non-existent with undefined values that vary across cultures and change over time
Progress in AI capabilities is exponential while progress in safety and control is linear at best
Companies have created testing frameworks with different names but similar core concepts for identifying dangerous capabilities
Need to move from testing AI models to testing AI solutions, products, and features that end users actually interact with
Summary
Yampolskiy presents a deeply pessimistic view that AI safety and control science is ‘mostly non-existent’ and that we cannot meaningfully test for unknown capabilities, while Painter and Ben Amor describe existing industry frameworks and testing approaches as viable, though imperfect, solutions.
Topics
Legal and regulatory | Cybersecurity | Economic
Focus of testing efforts – models vs. products vs. capabilities
Speakers
– Boulbaba Ben Amor
– Chris Painter
– Roman V. Yampolskiy
Arguments
Need to move from testing AI models to testing AI solutions, products, and features that end users actually interact with
Current testing detects when dangerous capabilities arrive rather than proving systems are safe or reliable
Cannot test for bugs or capabilities not yet anticipated, especially when systems become smarter than humans
Summary
Ben Amor advocates for shifting focus from models to end-user products and solutions, Painter emphasizes detecting dangerous capabilities as they emerge, while Yampolskiy argues that testing paradigms fundamentally break down when systems exceed human intelligence.
Topics
Legal and regulatory | Cybersecurity | Economic
Unexpected differences
Optimism vs. pessimism about current AI safety progress
Speakers
– Roman V. Yampolskiy
– Chris Painter
– Boulbaba Ben Amor
Arguments
Alignment and control science is mostly non-existent with undefined values that vary across cultures and change over time
Companies have created testing frameworks with different names but similar core concepts for identifying dangerous capabilities
Different fields require different benchmarks – healthcare, education, defense, finance each need specific testing approaches
Explanation
The stark contrast between Yampolskiy’s assessment that AI safety science is ‘mostly non-existent’ and other speakers’ descriptions of existing frameworks and testing approaches reveals a fundamental disagreement about whether current efforts represent meaningful progress or inadequate responses to an intractable problem.
Topics
Legal and regulatory | Cybersecurity
Scope of what constitutes ‘trust’ in AI systems
Speakers
– Anja Kaspersen
– Rachel Adams
Arguments
Trust is not a property of machines but how institutions and societies navigate uncertainty
Public perception and social attitude surveys are missing indicators for testing trust, particularly in regions like Africa
Explanation
While both speakers address trust, they approach it from different angles. Kaspersen argues against treating trust as a machine property and emphasizes institutional/societal navigation of uncertainty, while Adams distinguishes between technical trustworthiness and public trust, advocating for social attitude surveys as missing indicators.
Topics
Sociocultural | Legal and regulatory
Overall assessment
Summary
The discussion reveals significant disagreements on the fundamental feasibility of AI safety and control, the appropriate focus for testing efforts, and the current state of progress in the field. While speakers generally agree on the need for inclusive, multistakeholder approaches, they differ substantially on implementation strategies and assessment of current capabilities.
Disagreement level
Moderate to high disagreement with significant implications. The fundamental disagreement between Yampolskiy’s pessimistic assessment and other speakers’ more optimistic views of existing frameworks suggests the field lacks consensus on basic questions of feasibility and progress. This disagreement has major implications for policy direction, resource allocation, and the urgency of regulatory responses.
Takeaways
Key takeaways
AI testing and verification must be globally inclusive, moving beyond current concentration in resource-rich regions to incorporate diverse perspectives and worldviews
Terminology alignment between policymakers and technologists is critical – terms like ‘trust,’ ‘safety,’ and ‘robustness’ have different meanings across domains and require precise definition
The focus must shift from testing AI models to testing AI solutions, products, and features that end users actually interact with
Benchmarking needs to be field-specific, with different industries (healthcare, education, finance, defense) requiring tailored evaluation approaches
In Yampolskiy's assessment, AI safety and control science remains largely non-existent, with fundamental challenges in defining values, predicting capabilities, and testing for unknown risks
Verification must be understood as an ongoing, adaptive process rather than a one-time assessment
There’s a critical gap between exponential AI capability progress and linear progress in safety and control measures
Current testing primarily detects when dangerous capabilities emerge rather than proving systems are safe or reliable
Resolutions and action items
Establish ongoing discussion groups to continue the dialogue on AI testing and verification
Create a pipeline moving from research to pre-standardization to standardization
Develop capacity building initiatives for AI system testing globally
Harmonize standards across different bodies to reduce fragmentation while maintaining flexibility
Include technical communities in governance conversations before deployment and policy-setting
Conduct public perception and social attitude surveys, particularly in underrepresented regions like Africa
Focus on product conformity and risk assessments for AI solutions rather than just models
Establish independent AI safety institutes for external, third-party evaluation of models
Unresolved issues
How to effectively monitor live training runs of large, complex AI systems
How to test for capabilities and risks that haven’t been anticipated, especially when systems become smarter than humans
How to align AI systems with human values when values differ across cultures and change over time
How to prevent governance and technology development from becoming parallel tracks evolving at different speeds
How to move from voluntary industry practices to standardized, enforceable requirements
How to address the fundamental limitation that current testing cannot prove systems are bug-free or completely safe
How to balance the need for rapid AI development with adequate safety and security measures
How to ensure meaningful global representation in standard-setting processes beyond just including more voices
Suggested compromises
Maintain flexibility in standards while working toward harmonization to accommodate rapid technological advancement
Focus on narrow, testable properties rather than broad concepts like ‘trust’ to make ethical commitments operational and enforceable
Develop frontier safety policies that set conditional red lines for dangerous capabilities while allowing continued development
Create multistakeholder approaches that balance technical expertise with diverse cultural perspectives and local realities
Establish independent evaluation processes while maintaining industry innovation and development speed
Pursue both model-level testing and solution-level testing in parallel rather than choosing one approach over the other
Thought provoking comments
Very often it’s the people that are designing and developing the models that are designing and developing the benchmarks against which the performance of these models are being tested… How these technologies fail in different parts of the world is a really attenuated concern and it’s not a concern that can be answered by one group of people with one distinct kind of worldview.
Speaker
Rachel Adams
Reason
This comment exposes a fundamental circular problem in AI testing – the inherent bias when creators test their own systems. It challenges the assumption that current testing methodologies are objective and highlights how failures manifest differently across global contexts, introducing the critical concept of cultural and geographical bias in AI evaluation.
Impact
This comment established a key theme that resonated throughout the discussion – the need for diverse, inclusive testing approaches. It influenced subsequent speakers to address global representation and helped frame the conversation around power dynamics in AI governance rather than just technical solutions.
Trust is not a property of machines. It is how institutions and societies navigate uncertainty… robustness under stress is not fairness. Privacy is not traceability. Resilience is not transparency. Each of these terms and each of these processes demands distinct technical methods, oversight structures, and lines of accountability.
Speaker
Anja Kaspersen
Reason
This comment fundamentally reframes the discussion by deconstructing the oversimplified concept of ‘trust’ in AI. It provides crucial conceptual clarity by separating distinct technical and social concepts that are often conflated, moving the conversation from abstract ideals to concrete, measurable properties.
Impact
This comment shifted the discussion from broad philosophical concepts to specific, actionable technical requirements. It influenced the conversation’s trajectory toward more precise terminology and helped establish that effective AI governance requires disaggregating complex concepts into testable components.
If a system is smarter than you, more complex, more creative, it’s capable of doing something you didn’t anticipate. So we don’t know how to test for bugs we haven’t seen before… while progress in AI is really exponential or hyper-exponential in terms of capabilities, progress in safety, progress in our ability to control those systems is linear at best, if not constant.
Speaker
Roman V. Yampolskiy
Reason
This comment introduces the fundamental limitation of current testing paradigms when dealing with superintelligent systems. It highlights the asymmetric progress between AI capabilities and safety measures, presenting an existential challenge to the entire testing framework being discussed.
Impact
This stark assessment created a sobering counterpoint to the more optimistic discussions about benchmarking and standards. It forced the conversation to confront the possibility that current approaches may be fundamentally inadequate for future AI systems, adding urgency and philosophical depth to the technical discussions.
While the major effort is turned to evaluate and benchmark AI models, we need to move as quick as possible to evaluate AI solutions, AI products, and AI features… This is what the end users and the society will get to use at the end of the day.
Speaker
Boulbaba Ben Amor
Reason
This comment identifies a critical gap between academic/research focus on models versus real-world deployment of AI products. It shifts attention from theoretical capabilities to practical implementations that actually affect society, highlighting the disconnect between testing approaches and user reality.
Impact
This observation redirected the conversation toward practical implementation challenges and influenced the discussion to consider the full AI product ecosystem rather than just foundational models. It helped bridge the gap between technical testing and real-world deployment concerns.
There’s a really significant difference between, is a technology trustworthy? Does it work in the way we expect it to work? And do people trust in AI technologies? And one of the ways in which we can test for that is public perception and social attitude surveys, which are entirely missing from the African continent.
Speaker
Rachel Adams
Reason
This comment introduces a crucial distinction between technical trustworthiness and social trust, highlighting that testing must include societal acceptance measures. It also reveals a significant data gap in understanding global public perception of AI, particularly in underrepresented regions.
Impact
This comment expanded the scope of ‘testing for trust’ beyond technical metrics to include social dimensions, adding a new layer to the verification challenge. It reinforced the theme of global inclusivity while introducing empirical social research as a necessary component of AI governance.
Overall assessment
These key comments fundamentally shaped the discussion by challenging basic assumptions about AI testing and governance. Rachel Adams’ observations about circular testing and global representation established the theme of inclusivity and bias that ran throughout the panel. Anja Kaspersen’s conceptual deconstruction of ‘trust’ provided the analytical framework that elevated the discussion from abstract concepts to concrete technical requirements. Roman Yampolskiy’s stark assessment of the limitations of current approaches added philosophical depth and urgency, while Boulbaba Ben Amor’s focus on products versus models grounded the conversation in practical reality. Together, these comments transformed what could have been a technical discussion about benchmarking into a nuanced exploration of power dynamics, global equity, conceptual clarity, and the fundamental limitations of current AI governance approaches. The discussion evolved from optimistic problem-solving to a more sobering recognition of the complexity and urgency of the verification challenge in AI.
Follow-up questions
How can different parts of the world collaborate and interact with each other to effectively test AI systems and ensure their reliability?
Speaker
Robert Trager
Explanation
This was identified as a key discussion point from the workshop sessions, highlighting the need for global cooperation in AI testing frameworks
How can we move from research to pre-standardization, and then from pre-standardization to standardization in AI testing?
Speaker
Robert Trager
Explanation
This represents a challenging pipeline that requires ongoing work to establish effective standards in the rapidly evolving AI space
How can multistakeholder, multidisciplinary approaches to testing help develop more trustworthy technology that reflects diverse local realities and conditions?
Speaker
Rachel Adams
Explanation
This addresses the concern that current testing processes are centered in specific parts of the world and may not adequately represent global perspectives
How can we develop evaluation assets that reflect distinct local realities, conditions, and risks across different parts of the world?
Speaker
Rachel Adams
Explanation
This is critical because AI technologies may fail differently in various global contexts, requiring localized testing approaches
How can we achieve better harmonization of standards to reduce fragmentation in AI standardization?
Speaker
Boulbaba Ben Amor
Explanation
Current standardization efforts need convergence while maintaining flexibility to adapt to technological advancement
How do we develop specific benchmarks for different industry sectors (healthcare, education, defense, oil and gas, finance)?
Speaker
Boulbaba Ben Amor
Explanation
Each field requires different benchmarks as AI solutions are adapted to serve various industries with distinct requirements
How do we adapt trustworthy methodology to test agentic AI solutions that have more autonomy to make decisions without human intervention?
Speaker
Boulbaba Ben Amor
Explanation
As AI moves toward more autonomous systems, new testing methodologies are needed to ensure safety and reliability
How can policymakers and technologists collaborate to establish common understanding of technical terminology used in AI governance?
Speaker
Anja Kaspersen
Explanation
Terminology differences between technical and policy communities create barriers to effective verification efforts
How can we ensure that governance evolution keeps pace with rapid technological development to avoid parallel, disconnected tracks?
Speaker
Anja Kaspersen
Explanation
There’s concern that governance may ‘asymptote off’ and fail to keep up with technological advancement
How can we better integrate technical communities into governance conversations before deployment and before governance principles are set?
Speaker
Anja Kaspersen
Explanation
Technical communities are often not included in critical decision-making processes when their input would be most valuable
How can we conduct public perception and social attitude surveys on AI across different continents, particularly in underrepresented regions?
Speaker
Rachel Adams
Explanation
Understanding public trust in AI technologies requires comprehensive surveys that are currently missing from many regions, including Africa
How can we develop technical capabilities to monitor live training runs of large, complex AI systems?
Speaker
Roman V. Yampolskiy
Explanation
Current inability to monitor systems during training and the long testing periods required create challenges for implementing legal and governance measures
How can we test for unknown capabilities or ‘bugs we haven’t seen before’ in AI systems that may be smarter than humans?
Speaker
Roman V. Yampolskiy
Explanation
Traditional testing paradigms may be insufficient for systems that exceed human intelligence and creativity
How can we move from evaluating AI models to evaluating complete AI solutions, products, and features?
Speaker
Boulbaba Ben Amor
Explanation
End users interact with complete AI solutions rather than just models, requiring focus on product conformity and risk assessment for integrated systems
How can we better detect the moment when dangerous capabilities arrive in AI systems rather than just checking for reliability?
Speaker
Chris Painter
Explanation
Current testing focuses on detecting emerging dangerous capabilities rather than ensuring systems are safe, which is an important distinction for policymakers to understand
Disclaimer: This is not an official session record. DiploAI generates these resources from audiovisual recordings, and they are presented as-is, including potential errors. Due to logistical challenges, such as discrepancies in audio/video or transcripts, names may be misspelled. We strive for accuracy to the best of our ability.