Ensuring Safe AI: Monitoring Agents to Bridge the Global Assurance Gap

20 Feb 2026 14:00h - 15:00h

Session at a glance

Summary

This discussion focused on AI assurance and the challenges of ensuring that AI systems, particularly emerging agentic AI, are safe and trustworthy on a global scale. The panel was convened by Partnership on AI following the adoption of the New Delhi Declaration at India’s AI Impact Summit, which included commitments to strengthening multilingual and contextual AI evaluations.


Minister Josephine Teo of Singapore opened by emphasizing the rapid emergence of AI agents – autonomous systems that can achieve goals rather than just follow instructions – and the need for proactive governance rather than reactive regulation. She outlined Singapore’s approach, including government leadership in testing agentic AI and developing a model governance framework, while stressing that safety assurance could become a competitive advantage for companies.


The panelists highlighted significant gaps in global AI assurance capabilities. Frederic Werner from the ITU emphasized the need for international standards that incorporate principles like trustworthiness and inclusivity, while noting that 2.6 billion people remain offline and may be excluded from current frameworks. Owen Larter from Google DeepMind discussed the technical protocols needed for agents to interact safely and the importance of security measures, while acknowledging risks around agent autonomy and potential misuse.


Vukosi Marivate stressed that assurance frameworks designed in developed countries often don’t translate well to contexts with different languages, data, and deployment conditions, requiring more localized understanding and capacity building. Stephanie Ifayemi from PAI outlined six key challenge areas for closing the global assurance divide, including infrastructure, skills, languages, and risk profiles, noting that different regions may prioritize different safety concerns.


The discussion concluded with calls for AI assurance to become an operational discipline rather than just theoretical, emphasizing the need for continuous monitoring of agentic systems, interoperable evaluation methods, and shared global investment in assurance infrastructure, particularly in the Global South.


Key points

Major Discussion Points:

AI Assurance as Global Infrastructure: The discussion emphasized that AI assurance – the process of measuring, evaluating, and communicating whether AI systems are trustworthy and safe – needs to be treated as shared infrastructure that requires global collaboration rather than isolated national efforts.


The Rise of AI Agents and New Assurance Challenges: Participants focused extensively on agentic AI systems that can autonomously achieve goals and take actions, noting that these systems introduce new risks and complexities that existing assurance frameworks aren’t equipped to handle, particularly around security, interoperability, and continuous monitoring.


Closing the Global Assurance Divide: A central theme was addressing the significant gaps between Global North and Global South capabilities in AI assurance, including differences in infrastructure, language support, risk priorities, and technical capacity for conducting evaluations and third-party assessments.


From Reactive to Proactive Governance: The conversation highlighted the need to shift from reactive regulation to proactive preparation, with governments and organizations testing AI systems themselves, developing standards, and building assurance ecosystems before widespread deployment.


Multilingual and Contextual Evaluations: Following the New Delhi Declaration’s commitments, participants discussed the critical need for AI evaluations that work across diverse languages, cultures, and real-world deployment conditions, particularly given the linguistic diversity in regions like India (120+ languages) and Africa (1,500-3,000 languages).


Overall Purpose:

The discussion aimed to examine how to build robust, globally inclusive AI assurance systems in the era of increasingly autonomous AI agents. The session sought to move beyond theoretical frameworks toward practical, operational approaches that could work across different countries, languages, and deployment contexts while addressing the significant capacity gaps between different regions.


Overall Tone:

The tone was collaborative and solution-oriented throughout, with participants acknowledging both the urgency and complexity of the challenges. Speakers maintained a pragmatic optimism, recognizing significant obstacles while emphasizing opportunities for cooperation and shared responsibility. The discussion was notably inclusive, with consistent attention to ensuring Global South perspectives weren’t marginalized. The tone remained constructive and forward-looking, ending with clear calls to action for continued collaboration and implementation of the ideas discussed.


Speakers

Speakers from the provided list:


Rebecca Finlay – Works at Partnership on AI (PAI), appears to be in a leadership role focused on AI governance and policy


Josephine Teo – Minister from Singapore, involved in AI governance and assurance frameworks


Madhu Srikumar – Works at Partnership on AI (PAI), moderator of the panel discussion, involved in AI safety work


Frederic Werner – Works at ITU (International Telecommunication Union), involved with AI for Good initiative


Owen Larter – Works at Google DeepMind, focused on AI safety research and industry perspective on AI assurance


Vukosi Marivate – University of Pretoria, built Masakhane for African Language NLP, co-founder of Lelapa AI startup, focused on AI for African contexts


Stephanie Ifayemi – Works at Partnership on AI (PAI), involved in research on global AI assurance divide


Natasha Crampton – Chief Responsible AI Officer at Microsoft


Chris Meserole – CEO of FMF (Frontier Model Forum)


Additional speakers:


None identified – all speakers mentioned in the transcript were included in the provided speaker list.


Full session report

This comprehensive discussion on AI assurance took place as the final session of India’s AI Impact Summit following the adoption of the New Delhi Declaration, which included commitments to strengthening multilingual and contextual AI evaluations. The session, convened by Partnership on AI, brought together diverse perspectives from government, industry, academia, and international organisations to examine how to build robust, globally inclusive AI assurance systems in the era of increasingly autonomous AI agents.


Opening Context and Partnership on AI’s Global Mission

Rebecca Finlay from Partnership on AI opened the session by establishing the context for the discussion, highlighting PAI’s presence across 19+ countries and their commitment to ensuring AI technology serves humanity globally. She introduced two new research papers that would inform the discussion: “Strengthening the AI Assurance Ecosystem” and research on closing the global AI assurance divide. Finlay emphasised that the session aimed to move beyond theoretical frameworks toward practical implementation, bringing together both “thinkers and doers” to address the urgent challenges of AI assurance in an era of increasingly autonomous systems.


The session was structured around the recognition that as AI systems become more agentic—capable of autonomous goal achievement rather than merely following instructions—traditional assurance approaches require fundamental rethinking. Finlay noted that this transformation demands new approaches to testing, monitoring, and governance that account for the complexity of systems that can plan, adapt, and interact with minimal human oversight.


Singapore’s Proactive Approach to Agentic AI Governance

Minister Josephine Teo delivered a keynote address outlining Singapore’s pioneering approach to agentic AI governance. She highlighted the rapid emergence of AI agents, noting that these autonomous systems were barely discussed just twelve months ago at previous AI summits, yet now represent a fundamental shift in how AI systems operate. Unlike traditional AI that follows instructions, agentic AI can achieve goals independently, introducing new risks when human oversight is diminished or absent.


Singapore’s approach exemplifies proactive governance through government-led testing of agentic AI in high-stakes citizen services. Minister Teo emphasised that mistakes in areas such as health, social security, or benefits can have serious consequences when systems not only provide information but act upon it. The country has developed a model governance framework for agentic AI and established partnerships with companies like Google to create testing sandboxes, embodying the principle of “eating our own dog food” to establish credibility in governance.


Minister Teo outlined three essential components for robust AI assurance ecosystems: comprehensive technical testing that accounts for the complexity of multi-agent systems and their reasoning processes; standards development to define safety and reliability expectations; and third-party assurance providers who can independently attest to system safety and identify blind spots. Importantly, she reframed AI safety assurance as a potential competitive advantage rather than merely a compliance burden, arguing that companies demonstrating high safety assurance will differentiate themselves in the marketplace.


Panel Discussion: Global Perspectives on AI Assurance Challenges

Moderated by Madhu Srikumar, who defined AI assurance as “the process of measuring, evaluating, and communicating whether AI systems are trustworthy,” the panel discussion brought together diverse international perspectives on the challenges and opportunities in building global AI assurance systems.


International Standards and Global Connectivity Challenges

Frederic Werner from the International Telecommunication Union highlighted the stark reality of global disparities in AI assurance capabilities, noting that he has “yet to see [a] high potential use case developed in Brussels work equally well in Johannesburg and Shenzhen and maybe Panama.” This observation underscores the fundamental limitation of assuming AI solutions are universally transferable without considering local contexts, languages, and deployment conditions.


Werner emphasised that whilst there are numerous high-potential AI applications for addressing global challenges—from affordable healthcare to disaster response—the trust element remains problematic, and scalability across diverse contexts has not been achieved. He stressed that standards serve as crucial tools for translating ambitious governance principles into actionable details, particularly as businesses begin operating internationally with AI systems.


The connectivity challenge adds another layer of complexity, with 2.6 billion people remaining offline globally. Werner noted that even where connectivity exists, there may be insufficient incentive to connect without relevant content in local languages or useful applications. However, he highlighted AI’s potential to remove friction barriers related to literacy, disabilities, or language barriers, though he cautioned that increased access doesn’t guarantee positive outcomes without proper education and capacity building.


Industry Perspectives on Agentic Systems and Technical Standards

Owen Larter from Google DeepMind provided an industry perspective on the technical requirements for robust AI assurance, particularly for agentic systems. He described agents using the example of a system that can autonomously achieve goals—such as arranging dry cleaning pickup by Friday—rather than requiring step-by-step instructions. These systems are already being deployed across various applications, including coding assistance and other automated tasks.


Larter emphasised that the emerging agentic economy requires new technical protocols for agents to communicate with each other and with websites, similar to how HTTP and URL standards underpinned the early internet. Google has launched initiatives including the agents-to-agents protocol and universal commerce protocol to establish standardised information exchange methods. However, he acknowledged significant security concerns, particularly when autonomous systems connect to sensitive accounts such as email or banking services.


The security challenges extend beyond access control to include potential misuse of new capabilities that agentic systems might create, particularly in cybersecurity domains. Larter described Google DeepMind’s frontier safety framework for testing models before deployment and their collaboration with security operations teams to scan agentic applications for malware and vulnerabilities. He also highlighted the importance of developing compute-efficient models to ensure broader access to agentic systems and enable more rigorous testing globally.


Global South Perspectives and Contextual Requirements

Vukosi Marivate from the University of Pretoria and co-founder of the Lelapa AI startup provided crucial insights into how AI assurance frameworks developed in major tech centres fail to address Global South contexts. He emphasised that effective assurance cannot be purely top-down but requires local understanding and capacity among policymakers to implement appropriate labour laws, data governance, and system monitoring. Without this local capability, AI systems may operate in ways that contradict local values and needs.


Marivate highlighted the complexity of implementing “test once, comply globally” approaches, noting that whilst high-level safety standards may be achievable, individual user experiences require localised understanding and testing. His experience building systems for African contexts demonstrates the importance of developing AI “for Africans by Africans” rather than adapting solutions designed elsewhere. He stressed that meaningful assurance requires bottom-up engagement and cannot rely solely on frameworks imposed from global institutions or developed countries.


The linguistic diversity challenge is particularly acute in regions like Africa, which has approximately 1,500 to 3,000 spoken languages, and India, with over 120 languages and 19,500 dialects. This complexity makes designing comprehensive evaluations that account for diverse language ecosystems extraordinarily challenging, yet essential for meaningful AI assurance in these contexts.


Research Insights on Closing the Global Assurance Divide

Stephanie Ifayemi from Partnership on AI presented findings from their research on closing the global AI assurance divide, identifying six key challenge areas: infrastructure, skills, languages, risk profiles, access barriers, and capacity constraints. The infrastructure challenge is particularly significant, with comprehensive AI evaluations requiring substantial computational resources that create barriers for countries with limited access.


However, Ifayemi noted that even developed countries face access challenges, with the UK’s Department for Science, Innovation and Technology prioritising model access for assurance purposes by 2026. This suggests that access limitations affect the entire global AI assurance ecosystem, not exclusively Global South countries.


The research revealed important differences in risk prioritisation across regions, highlighting the need for assurance frameworks that account for varying local priorities and values rather than applying universal standards uniformly. Ifayemi discussed tiered approaches to agent assurance based on risk levels and reversibility, recognising that different applications require different levels of oversight and intervention capabilities.


Ifayemi emphasised the critical importance of ensuring Global South countries participate in forward-looking frameworks such as agent standardisation rather than constantly playing catch-up. She pointed to recent announcements from the US National Institute of Standards and Technology about standardising agent work as opportunities for inclusive participation that could prevent further marginalisation of developing countries.


Technical Evolution and Continuous Monitoring Requirements

The discussion highlighted fundamental changes required in AI assurance approaches for agentic systems. The shift from traditional pre-deployment testing to continuous monitoring becomes critically important for systems that can plan, chain actions, interact with tools, and adapt over time. This requires new approaches to real-time detection capabilities and clear accountability frameworks for intervention decisions.


Speakers emphasised that assurance must be built into system development lifecycles rather than retrofitted, requiring systems to be designed for observability, auditability, and constraint enforcement from the outset. The technical challenges of monitoring complex multi-agent systems remain significant unresolved issues, as traditional evaluation approaches may be insufficient for systems that exhibit emergent behaviours through agent interactions.


Collaborative Frameworks and Shared Infrastructure

Throughout the discussion, speakers emphasised that effective AI assurance cannot be achieved by individual organisations working in isolation but requires coordinated collaboration across sectors, organisations, and stakeholders globally. This collaborative imperative extends beyond technical cooperation to include shared investment in capacity building, particularly in the Global South, and the development of interoperable evaluation methods that can work across different regulatory and cultural contexts.


The conversation highlighted the need for changing incentive structures to support assurance adoption, including exploring insurance mechanisms that reward robust assurance practices and professionalising the assurance provider ecosystem through accreditation and standardisation. Such systemic changes could help mature the global assurance ecosystem beyond current ad hoc approaches.


Closing Perspectives and Future Directions

Natasha Crampton from Microsoft delivered closing remarks emphasising the shift towards continuous monitoring and real-time detection capabilities required for agentic systems. She stressed that assurance must be treated as shared infrastructure requiring collaborative building and implementation across the global community, with systems designed for observability and accountability from the outset.


Chris Meserole provided final summary remarks, identifying three core themes from the discussion: the need for inclusive global participation in AI assurance development, the importance of balancing standardisation with local contextualisation, and the urgency of developing new technical approaches for monitoring autonomous systems. He concluded with a call to action for participants to download Partnership on AI’s research reports and actively engage in building the collaborative frameworks necessary for effective global AI assurance.


Implications for Global AI Governance

The discussion revealed that effective AI assurance requires fundamental shifts in how the global community approaches AI governance. Rather than relying on frameworks developed in major tech centres and applied globally, the conversation highlighted the need for inclusive, bottom-up approaches that account for diverse contexts, priorities, and capabilities.


The emphasis on treating assurance as shared infrastructure suggests new models of international cooperation may be needed, potentially involving novel funding mechanisms, technology transfer arrangements, and capacity building partnerships. The recognition that companies demonstrating strong safety assurance may gain competitive advantages also suggests market-based incentives could complement regulatory approaches.


Despite broad consensus on the importance of global AI assurance, several critical challenges remain unresolved. The tension between standardisation and localisation presents ongoing difficulties, while resource allocation decisions require careful consideration of computational requirements and capacity building needs. The technical challenges of continuous monitoring for complex multi-agent systems represent frontier research areas requiring significant investment and innovation.


This comprehensive examination of AI assurance challenges and opportunities provides a foundation for understanding the complex technical, social, and governance issues that must be addressed as AI systems become increasingly autonomous and globally deployed. The discussion demonstrates both the urgency of these challenges and the potential for collaborative solutions that ensure AI development benefits all of humanity rather than exacerbating existing inequalities.


Session transcript

Rebecca Finlay

in 19-ish countries, and we’re all focused on what it means to unlock innovation through trustworthy, responsible, beneficial AI. And so, of course, no surprise, gatherings like the one that we’ve had this week are really crucial for the work we do, and with the Delhi Declaration adopted yesterday, this is an even more important moment to build on where we have come from, to lean in, and to really get to work around some of the questions of the accountability work that needs to be done, the scientific evidence that we need to build around frameworks and good policy moving forward. And, of course, it’s extraordinarily important that this is happening in India, that it’s bringing a whole set of voices and perspectives and leadership that is not optional.

At PAI, we believe that is fundamental to building a global community committed to this work, and it’s great to see it in action this week. So thank you all for being here with us. So today we’re going to give you an opportunity to see two of our latest papers. These are papers that were begun out of the Paris Action Summit. And at that time, as we were thinking about moving into action and innovation, we felt that work needed to happen with a good sense of what the assurance ecosystem looked like. So we’ve had working groups underway developing these two new resources. They’ll be up on the screen at some point. You’ll be able to get a QR code and download them.

Feel free to talk to any of us. The first one is Strengthening the AI Assurance Ecosystem. It is really about telling and helping national policymakers: if you’re building a robust industrial AI strategy, you’d better have a comprehensive AI assurance strategy as well, and you need to be able to do that. So we’re going to be talking about that; we need to think about all those actors and what they look like. We’re going to hear from one of the experts in this, of course, as soon as the minister comes to join us. The second piece, which is really important, we think, for this conversation is: what does it mean to do AI assurance globally, around the world?

How do we close the divide that exists? What is different about the challenges faced by countries in the Global South versus others? So we’re really hoping that these resources are not only good, substantive contributions to the work that needs to be done, but the idea is also to catalyze, you know, to plant a number of seeds across the ways in which assurance works, so that those can grow and really come to life out of this. And just two quick comments on that. Now that we have the declaration, as opposed to earlier in the week, we can start to articulate it, really leaning in with regard to the commitments: in commitment one, clarity around usage data, really trying to give some empirical grounding to this work.

In 2025, in our progress report on foundation model impact, we made exactly this recommendation. We directly called for frontier AI companies to share usage data. We’ve been tracking progress, and there has been some progress in that regard. So we are delighted to see this particular commitment come about and to start to see some standards about how that usage data is going to be shared. So we’re very pleased to see that work. We’re also very pleased to see the second commitment around strengthening multilingual and use case evaluations. And you’ll see, if you do download the report on the global assurance divide, that that is clearly a key piece of work that needs to happen. So this afternoon, we are going to give you an extraordinarily expert panel that brings a real diversity of perspectives to this work.

And so we want to take the assurance question and apply it to agents, because that’s where the world is going. We’re all seeing them in the news every day. We’re seeing them integrated into foundation model systems. So what does it mean to take what we know about assurance and think about the complexity that agents will add to that work? So let me begin by introducing our first speaker. She’s probably been one of the most visible ministers this week because of the extraordinary leadership that Singapore has taken when we think about AI assurance. I know you’re going to talk a little bit about that. Such a pleasure to welcome you, Minister Josephine Teo.

She’s going to come and say some words for us before the panel begins. Thank you.

Josephine Teo

Thank you very much, Rebecca, and also very much appreciate Partnership on AI for the invitation. When this series of summits first began in Bletchley, AI agents were not a thing. Nobody was talking about them, even just 12 months ago. When we had the AI Action Summit in Paris, it had barely crept into the conversation. At the time, the preoccupation was all around DeepSeek and what it told us about the capabilities emerging out of China. But today, as Rebecca correctly identified, agentic systems have taken off. They are increasingly being used, and we need to have a better grasp of how to deal with this issue, because agentic AI certainly offers transformative possibilities in how we delegate and orchestrate work when deployed strategically.

Agents function as invaluable teammates, unlocking productivity gains and time savings, which we all want more of. However, I should also add that the very nature of what makes agents helpful to us is autonomy, and this autonomy also introduces new risk. The potential for harm increases when systems malfunction and human oversight is minimized: we are no longer present, or our presence is at least diminished to a very large extent. The implications may be complex and not fully predictable. So the way my colleagues and I have been thinking about this is that there needs to be a shift, a shift from reliance on reactive regulation to a different kind of stance, which is proactive preparation.

And in Singapore, that’s what we’ve been trying to do. We’ve tried to be proactive about governing the new risks in the era of agentic AI. And I think it starts with the government itself being a leader, and not a laggard, in using agentic AI. We need to test it. We need to look at how the solutions can not only enhance public service delivery, but also how we can put in place more controls. Government is high risk because the touch points with citizens are very sensitive. No citizen and no government wants serious mistakes in interactions with citizens: telling them things about their health, their social security, or their benefits that are not accurate, and having those things not just told but acted upon.

So this need to ensure that we know what we’re doing is a very high one. And the way we are also thinking about it is to try and work with industry. So, for example, between Google and the Singapore government, we have a sandbox on agentic AI. It’s one of the ways we think we can, in a way, eat our own dog food. Try it. You know, does it taste all right? Does it hurt us in a very significant way? Because if we were not able to do so, I don’t think we would have a lot of credibility in terms of how we want to govern agentic AI. But we can’t wait, you know, for the dog food to materialize in its consequences for ourselves.

In the meantime, my colleagues have put together a model governance framework for agentic AI. It is meant to provide practical support to enterprises so that they, too, can deploy autonomous agents responsibly and mitigate the risks. We know that this is not a complete solution, and the document that we put out has to be a live document. We very much encourage feedback as a way for us to keep improving the guidance to enterprises. Can I also just add, as we do this work, what is the meaning and what is the purpose behind it? Ultimately, it is to build confidence in the use of agentic AI systems. And we think that at many levels, this confidence has to be presented, has to be demonstrated, to boards of organizations, to customers, to other stakeholders.

And how do we demonstrate that the risks have been managed well? And that is where the assurance ecosystem that Rebecca talks about comes in. It is an absolutely essential part of building trust over the medium to longer term so that there is a way, a foundation upon which agentic AI systems can be made more readily adopted and available. I should also say that for companies that are thinking about it, and I see Microsoft here, and I’m sure that there are other companies represented. If we are to trust these agentic systems, the safety aspects should not be downplayed. And I would venture to say that a company that is able to give a high assurance on safety will find itself being differentiated from their competitor.

It’s more likely to translate into stronger interest in a product and service. So rather than think of it as something that you are unhappy to comply with, think of it as a strategic competitive advantage. And that is a way, I think, that will give us the confidence to put it forward. The question, however, is: are we completely without experience in this regard? And the answer is no. In aviation and healthcare, there are a lot of measures in place to give assurance. When we board a plane, we usually expect to arrive. When we visit the hospital, we generally expect to be treated, except for disease conditions that are not yet well understood.

But the trust in these systems has to be built over time, and it doesn’t come without some assurance being put in place. The question for AI, and specifically agentic AI, is: what would be the components? What leads to an assurance ecosystem that would be robust enough? We think that there are at least three components. The first is that there must be testing. We need some way of making sure that there are technical assessments of the system, to make sure that the systems are robust, reliable, and safe. And a lot more work needs to be done in this space: developing the testing methodology, building the testing datasets, and also making sure that the testing of agentic systems takes into account their complexity.

These systems are going to be much more complex, with multi-agent setups, for example, and it’s not just the output, but the in-between steps: how the reasoning takes place, and what orchestration is built into the agentic systems. So that’s the first, testing. Second is that eventually we will need standards. We cannot just define what is good enough; we also need to assure users that it has met expectations in safety and reliability, and these are still very early days. Thirdly, we think that this ecosystem cannot do without third-party assurance providers. It’s one thing to claim that your agentic AI system is safe. It’s another thing to have someone attest to its safety.

So these could be technical testers and auditors; they provide independence, augment in-house capabilities, and help to identify blind spots, and it’s necessary for us to strengthen this pool as well. So I’m going to stop here. I want to conclude my remarks by saying that Singapore is actively building these components, and we welcome conversations with partners and colleagues because we know that we cannot do this alone. So we look forward to discussions in the three panels on how we can meaningfully collaborate on assurance for agentic AI. Thank you very much once again, Rebecca.

Madhu Srikumar

Thank you. Thank you. We’re all here. It’s the end of the conference, and we’re all intact. Thank you so much, everyone, for joining us. Thank you, Minister Teo, for the keynote. One quick note before we dive in. Our panelist, Fred, has a flight to catch, so he’ll need to slip away a few minutes early, but, Fred, we’ll make sure we get your best insights before you escape. No pressure. So we are the last session, so we are standing between you and whatever you have planned right after. So I promise we’ll make this worth it. We have an incredible panel and a lot of ground to cover. So before we get started, what do we mean by AI assurance?

Because you’re going to keep hearing that term quite a bit here. So, really put simply, AI assurance is the process of measuring, evaluating, and communicating whether AI systems are trustworthy. Are they safe? Do they work as intended? Can the public actually trust them? So really think of it like a safety inspection, but for AI. You’d want an independent inspector checking a building, not just the builder saying, trust me, it’s fine. So really, AI assurance is about independent verification, as Minister Teo went over. And why this panel? Why now? The summit unveiled the New Delhi Frontier AI commitments just yesterday. And the second of those commitments is about strengthening multilingual and contextual evaluations.

So really making sure AI systems work across languages, cultures, and real world conditions. And really, that’s the assurance challenge in a nutshell. And our panel today is about whether we are actually equipped to deliver on that promise globally and not just in a handful of countries. So really, our panelists span the ITU, Google DeepMind, the University of Pretoria, and PAI. So we have the range to actually wrestle with this question. So with that, I’m going to get into our first question for today. Fred, that’s going to be you. ITU has been convening on AI governance through AI for Good and working on standards across borders. So really, when we talk about AI assurance, what does it mean to you, ensuring that these systems are safe and trusted?

And how do we think about assurance when 2.6 billion people remain offline and may be excluded from the frameworks being designed?

Frederic Werner

Yeah, thanks for that great question, and thanks for having me here. So I think it’s safe to say there’s no shortage of high-potential AI for Good use cases, everything from affordable health care to education for all, food security, disaster response, and also more applications in the physical manifestations of AI that you see in robotics, embodied AI, and brain-computer interface technologies. The best part of my job at AI for Good is I see these use cases coming across my desk every day. And I can tell you, when we started AI for Good in 2017, it was mainly in PowerPoint slides. They didn’t really exist. But then we got into 2023 with GenAI, and last year, the unofficial theme of AI for Good was the rise of the AI agents, a bit scary, Terminator-like, but that’s what people were talking about.

And we’re really going from the promise to the pilots to the use cases and now scaling. Now, when you’re looking at these use cases, I think one big challenge is trust. How do you trust them? I mean, there’s always the good intention, right? But is that trust there? And also, are they replicable and scalable? I’ve yet to see a high-potential use case developed in Brussels work equally well in Johannesburg and Shenzhen and maybe Panama. We just haven’t really reached that yet. And if you look at these fast-emerging governance frameworks around the world, whether you’re in the U.S. or EU or China or everything in between, I think there’s a lot of good intentions, a lot of good thinking.

But how do you turn those ambitious words and principles into actions? Because the devil is in the details, and I think standards have details. So when you’re thinking about how to do that, especially when you start to get into AI agents, that trust element is becoming ever more critical. How can you bake in a lot of the common-sense things that we’ve been talking about all week, or even for the past years at AI for Good? Are they trustworthy? Are they verifiable? Are they secure? Are they safe? Are they designed with human rights principles in mind? Are they inclusive? Are people from the global south at the table when we’re drafting and developing these standards?

So these are not always natural reflexes, and at the same time, it’s hard to turn words into action. So one of the tools, and I’m not saying it’s the only tool, is standards: as these solutions start to scale and businesses start to interact nationally or even internationally, at some point you’re going to need standards, and it’s within those standards that you can bake in those common-sense principles that we’ve all been talking about. And I forget the last part of your question. Oh, connectivity. That was it, yes: the 2.6 billion people who remain offline. So, you know, ITU’s mission is connecting the world, and a third of the world is still offline.

And, you know, large parts of the world actually have connectivity, but there’s actually no incentive to connect. If there’s no content in your local language or dialect, no access to government services, or no useful applications that are fit for purpose where you live, why would you connect? So I think AI can actually help to remove that friction where you have a lot of bottlenecks, for example literacy, disabilities, or, again, content in your own language or dialect. So one thing is closing the connectivity gap, but the other thing is actually using AI to remove that friction. And the last thing I would say is, I think sometimes there’s a comparison where, if you take East Africa for example, you have the mobile payment miracle or revolution with M-Pesa, right? You effectively leapfrogged decades of legacy infrastructure. And there may be a kind of optimism that the same thing could happen with AI in the global south. Maybe, but I don’t think we can take it for granted that if that happens, it goes in the right direction. It’s not a guarantee that just by putting the tool in the hands of the people, they’re going to create value, use it responsibly, use it to solve local challenges, or build more cohesion and community. Those aren’t for granted.

So I think that whole AI skilling angle, of really educating people from grade school to grad school to diplomats and everyone in between, matters: if you don’t address that literacy piece, then it’s just going to be a crapshoot; we can’t be sure.

Madhu Srikumar

Great. I mean, it’s a good transition. Speaking of standards: Owen, Google DeepMind recently deepened its partnership with the UK AI Security Institute on safety research, including work on monitoring chain of thought and evaluations. So really, from an industry perspective, what does robust AI assurance look like? Where do you think the gaps and opportunities are between what frontier labs do internally and what’s needed for broader public trust?

Owen Larter

Yeah, thank you, Madhu. And thank you to Rebecca and Partnership on AI for convening this really important conversation. And a big congratulations to our Indian hosts for a fantastic week at the summit. Maybe I’ll start by talking a little bit about what agents are; we’re increasingly excited about them at Google DeepMind. They’re essentially more autonomous systems that, instead of just following basic instructions, can actually achieve goals. So let’s say I want to get my suit dry cleaned on Thursday. Instead of telling an AI system, find a website for a dry cleaning company, see if it’s open on Thursday, see what the hours are, see if it’s within my budget, you can just say to your agentic system, go find a way to dry clean my suit and make sure it’s picked up by Friday, and it will go and interact with those different websites and try to find a way to meet your goals.

We’re already seeing all kinds of fantastic applications right across the economy. We’re using increasingly agentic coding systems at Google and Google DeepMind to do a lot of our coding. So we have our Antigravity framework, which is fantastic. You can interact with it in normal, natural language and say, build me a website, build me a tracking system to follow a particular bill that I’m interested in, and it will really help you achieve these goals. I think you’ll increasingly see agents used right across the economy as well. I think we’re just in the early years of a new AI-enabled agentic economy. I think you will have very normal interactions with agents on a regular basis; one will pop up on your phone screen and say, hey, it’s been a few weeks since you bought toothpaste.

Would you like me to go and take care of that and get some more toothpaste for you? You mentioned standards, which I think is going to be a critical part of getting all of this right. There are a couple of dimensions to the standards. Firstly, we need to create the technical protocols to actually underpin this agentic economy. We’ve been trying to contribute to this conversation: there is the Agent2Agent protocol that Google has launched, and there’s the Universal Commerce Protocol. These are basically ways of helping agents talk to each other, and agents talk to websites, so that you have standardized sets of information. An agent will basically come to another agent or to a website and say, this is my ID.

These are my capabilities. This is what I’m trying to do. In the same way that we developed protocols and standards in the early 90s to underpin the internet, like HTTP and the URL, we’re going to have to build these out. There are then also assurance standards, which are related but, I think, very important as well. We need to make sure that we’re understanding the capabilities of these systems. We need to keep making progress on how we can test for the risks that they may pose, and then work right across society to come up with ways to mitigate them. I think the work that the safety and security institutes are doing around the world is absolutely critical.
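To make the handshake idea concrete, here is a minimal sketch of the kind of standardized self-description an agent might present when it first contacts another agent or a website. The field names and the `handshake` helper are illustrative assumptions, not the actual Agent2Agent or Universal Commerce Protocol schema:

```python
import json

# Hypothetical "agent card": a standardized self-description an agent
# presents on first contact, loosely mirroring the exchange described
# above ("this is my ID, these are my capabilities, this is what I'm
# trying to do"). Field names are illustrative, not a real protocol schema.
agent_card = {
    "id": "agent://example/dry-cleaning-assistant",
    "capabilities": ["search_listings", "check_opening_hours", "place_order"],
    "goal": "dry-clean a suit, pickup by Friday, within budget",
    "protocol_version": "0.1",
}

def handshake(card: dict) -> str:
    """Validate and serialize the card for the initial exchange.
    A real protocol would also authenticate the agent's identity
    and negotiate capabilities with the counterparty."""
    required = {"id", "capabilities", "goal"}
    missing = required - card.keys()
    if missing:
        raise ValueError(f"incomplete agent card, missing: {sorted(missing)}")
    return json.dumps(card)

message = handshake(agent_card)
print(message)
```

The point of the sketch is the standardization itself: because both sides agree on the required fields, a counterparty can reject an incomplete or malformed card before any goal-directed interaction begins.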

So Minister Teo mentioned some of the work that we’re doing in Singapore. The UK AI Security Institute has been world-leading on this. I think this is an area where we’re going to see more from the AI safety and security institutes right across the world; the US government also, through CAISI, launched an agent standards initiative this week as well.

Madhu Srikumar

Great. And if you don’t mind a follow-up question: that’s a really important point you made, that we need interoperability for agents to flourish, that we need to find a way to imagine this new paradigm. But I’m curious: is there a safety challenge, when it comes to agents, that keeps you up at night?

Owen Larter

Yeah, I think there are definitely risks to be mindful of. I think agent security is something that we should all be thinking a lot about. If we’re connecting increasingly autonomous systems into different accounts, different email accounts, different bank accounts, I think we want to be pretty careful about how we do that and come up with security protocols that can be helpful there. We’ve actually been doing some work with VirusTotal, which is part of the Google security operations team, to make sure that when certain agentic systems are downloading skills or apps from agentic websites, they’re being scanned so that malware or vulnerabilities are detected and can be addressed before people put them onto their computers. I think there’s also a concern that these agentic systems could create new capabilities that could be misused, in the cybersecurity domain, for example. I think some of the frameworks that we have already at Google DeepMind will be helpful here. We have our Frontier Safety Framework, which we use to test models before we put them out into the real world.

We think about how those models are going to interact with systems, and how they might be part of agents, as we’re doing that work.

Madhu Srikumar

All right. Just speaking for myself, I can’t wait to use agents. I feel like it’s mostly developer communities that have started playing around with these systems, but I imagine they’ll reach lay consumers very soon. So, Vukosi, you have built Masakhane for African-language NLP, really building AI for Africans, by Africans. When assurance frameworks are designed in the U.S., U.K., or Singapore, how well do they translate to contexts where the data, the languages, and the deployment conditions are completely different? What do we think we’re missing?

Vukosi Marivate

…that we do get to understand that it’s a very different thing. My experience has been that there’s likely not as much data collection or annotation happening in Europe or North America as is happening now in the global south. But that also means that it feels further away, right? It’s not where the developers are. And that then requires more of this conversation in one place, so that, again, there is a kind of local understanding. The last piece is going to be the capacity and the capabilities of the policymakers in those countries to be able to understand that part. It will not be top-down; I don’t believe that. It will be them understanding, whether it’s labor laws, data governance, or just monitoring of systems once they’re deployed.

If there is not the capacity or capability to actually do those things, then again, things move in a more automated direction that is not necessarily aligned with what the values of those people actually are.

Madhu Srikumar

Those are important words right at the end of the conference, knowing just how much we have to get done here. So, Steph, over to you. PAI just released work on closing the global assurance divide, touching on a lot of what Vukosi just mentioned. What are the concrete gaps you’re identifying? Is it capacity to conduct third-party evaluations, as Minister Teo mentioned? Is it access to the models being tested, or is it something else? What would it take to really close those gaps?

Stephanie Ifayemi

Awesome. Thanks so much, Madhu. And as one of the PAI folks, thanks for being here, everyone. It’s great to see you all. I know it’s a Friday evening, so we’re standing between you and cocktails or whatever you have planned, so we very much appreciate you staying for the last session of the day. I think it’s such a good question, and it recognizes that those challenges aren’t actually just global south challenges. I want to start with the fact that we’ve released two papers: one on closing the assurance divide, and the other on how we strengthen the global assurance ecosystem generally. And the question of access is one that impacts us all, actually.

In the UK, for example, the Department for Science, Innovation and Technology (DSIT) has made access to models, as a means to support assurance, a priority for 2026. And so I think that there are a few shared challenges, and I’ll come back to the point around north-south collaboration in a second. But just thinking about closing the AI assurance divide: we released this paper, and in it we talk about around six challenge areas, from infrastructure to skills. We talk about languages and risk profiles, the things that you’ve heard about from Vukosi and a lot of the other speakers. So I’ll give you a sense of some of the examples that we have.

So on language: we’re at the India Summit, of course, and India has over, I believe, 120 languages and 19,500 dialects. When we think about Africa, we have somewhere between 1,500 and 3,000 spoken languages. So when we think about benchmarking and designing evals that account for how those systems are deployed in these various contexts, it’s so important to think about languages, and that just generally demonstrates the complexity of designing evals to meet the needs of this kind of diverse language ecosystem. Rebecca mentioned at the start that we had the declaration yesterday, and the commitment in the declaration to multilingual evals is really critical. Of course, there’s still a lot of work to determine how we actually do that in practice in the most effective way, accounting for that complex and wide language diversity, but that’s one area that we talk about.

The second thing we need to account for in closing the assurance divide is, interestingly, risk profiles. In this paper, we interviewed a lot of assurance and safety experts internationally, and one of the things they mentioned was differences in what they might prioritize when you think about assurance. The Pacific Island nations, for example, would be thinking about assuring for environmental impacts differently than environmental impacts might be considered in the US at the moment. Last year, we published a paper on post-deployment monitoring, and in that paper we talk about companies sharing data, and one of the points we discuss is environmental impacts.

And so it’s really interesting that, in terms of closing the divide, the starting point, or what you put emphasis on, might vary. And that’s important to note as we’re designing things like documentation, description, and so on. The third thing I’ll quickly mention is, of course, infrastructure. I think we’ve all heard a lot throughout the summit about what it means to be sovereign and which parts of the stack to prioritize. And that is really, really important, but there are tradeoffs. To give a sense of scale, I was looking at a stat that Stanford’s HELM evaluations used over 12 billion tokens and required 19,500 GPU hours alone.

And so when you think about those kinds of infrastructural needs, it creates barriers for a lot of countries in the global south. But I was at an interesting roundtable that Carnegie was convening, and we were talking about how you balance assurance needs: where do you start from across the value chain? At the moment, a lot of the discussion is upstream, right? We need to have that infrastructure in place; that’s the point we need to start with. But how do you do that in parallel, and how much of that resource should be put into other foundational tools for assurance, such as documentation artifacts, which is another area that we focus on a lot at PAI?

And so I think there will be a lot of questions around how you weigh up all these challenges, knowing that even in the G7 countries, the UK AI Safety Institute started with an inaugural £100 million alone. So that prioritization and balancing is going to be important. The last thing I’ll say, coming back to agents, and I will talk about this a bit more, is that north-south collaboration is a real opportunity as we think about agents. It’s important that global south countries aren’t always playing catch-up. A point that has come through for me from the summit concerns NIST, or rather CAISI, the Center for AI Standards and Innovation.

And this is almost like a test for me, saying the names of all these institutions through this panel. But they announced a few days ago that they’re going to be working on standardizing work around agents, including releasing an opportunity to comment on a paper around agent attribution and agent identity, I believe, which is really interesting. And there’s, of course, a lot of push for countries to collaborate, and you see a lot of the safety institutes in the global north collaborating on questions around assuring agents. But how do we ensure that global south countries aren’t missing from that? It will have implications for how we attribute agents and how we test agents.

And we shouldn’t just assume, again, whilst those upstream points on infrastructure are important, that global south countries will automatically be included; in parallel, they ultimately need to be part of these thinking-ahead questions and frameworks.

Madhu Srikumar

Great. So I’m going to take the moderator’s prerogative and have us do a rapid fire. And by rapid fire, I mean every answer is a minute and 30 seconds, which, let’s be honest, is fairly rapid for AI policy. I’m going to start with Fred, because I’m more nervous about your flight than perhaps you are. So, a minute and 30 seconds: what role should multilateral institutions like the ITU play in making globally inclusive AI assurance happen?

Frederic Werner

Yes, I think AI for Good has a pretty ambitious goal, right? Simply put, it’s to unlock AI’s potential to serve humanity. Pretty big. But we can’t do it alone, and no one can: not one country, not one institution, not one NGO. That’s why we have 50-plus UN sister agencies as part of AI for Good, but we’re also making great efforts to bring as many diverse voices to the table as possible, from the global south, from NGOs, from civil society. It’s always been extremely open. I like to think of it as the Davos of AI, but instead of being very exclusive, it’s extremely inclusive, right? So I think that’s a bit of the philosophy behind AI for Good.

You know, AI is just moving so quickly, so the focus has always been on practical applications, practical solutions. But in doing that, you can tease out the next generation of standards, of policy recommendations, of collaboration and partnerships around the world. So I like to think that in the doing, you have the learning, right? It’s not just about talking, and that’s what AI for Good has always been about.

Madhu Srikumar

Thank you. That was incredible. You have 56 seconds left. So I’m going to move us ahead to Vukosi. Singapore’s aim is to test once and comply globally. From a global south perspective, what would make that interoperability real rather than a form of exclusion?

Vukosi Marivate

Yeah, that’s a hard one. I think, going back to what’s come out of a lot of the sessions here, a big theme has been evaluations and how evaluations are used, and I think that’s a really important thing. Because on one side, it’s going to take you a lot of resources to put up an evaluation that is so all-encompassing, and on the other side, to run it is going to be a lot. But then when it comes down to the user, which I think was the second panel that I was in this week, and you’re trying to think about personalization: if you’re going down to an individual, what experience do they actually have, and how do you get there?

There will be some higher-level safety things that will likely come out, and people will be working on those, and maybe that’s what I think Singapore is trying to go for. But when we’re getting to what the individual experience is, given that you have these stochastic systems, you don’t necessarily know what is going to happen. I know we’re trying, but we don’t really know what’s going to happen at the level of individual experience, and we can’t model all of that. It’s going to require, again, that you have checks closer to where the user might be on what that experience actually was. So, one of the hats I wear is as a co-founder of Lelapa AI, an AI startup.

And there you will be doing more testing towards: hey, we are serving this client, we’re serving them in this way. And then you go in and ask, where is your data coming from? What are the use cases? What are we testing for in terms of their operational requirements? It would not necessarily be just one. But, yes, what you might want is…

Madhu Srikumar

Yeah, that’s a great point. Assurance needs to be globally decentralized. Owen, given everything we have discussed, what’s one commitment Frontier Labs should make on assurance that would actually move the needle?

Owen Larter

Yeah, good question. I think there’s a question of access to the technology, which is important here. I think it’s one of the big themes of this conference, certainly one of the things that I’ll be taking away. The multilingual part of this is really important: understanding and respecting local cultures matters if you’re going to have a good product and if it’s going to be used broadly. We’ve been investing in Gemini for some time now to make it better and more representative across different languages. We have partnerships that we’re doing here in India, including with IIT Bombay, to help improve performance across various Indic languages. It’s also really important on the safety and security front to have benchmarks that are available in different languages; there’s fantastic work that MLCommons is doing on this front that we’re pleased to support. The other bit of access that I think is really important is having things that are quick and cheap enough for everyone to use. One of the things about agentic systems is that they’re actually pretty compute-intensive. We have a range of models that we’ve developed and are bringing to market at Google DeepMind, including our very quick Flash models, which are relatively cheap, quite efficient, and very, very quick.

We think these can play a really important role in powering agentic systems. It’s also going to be really important if we’re going to do effective and rigorous testing of these systems, because that could be very compute-intensive as well. So thinking about that access piece is something we all need to keep doing. And it’s not an easy question, really: doing it safely, and ensuring that third-party assurance providers consider the security questions at hand. It’s an open question.

Madhu Srikumar

So, Stephanie, no bias at all since we’re both at PAI, but I wanted to give you the final word. What concrete outcomes do you think we want to see from the global AI assurance work in the next 12 months? What would success look like?

Stephanie Ifayemi

So, Owen, now that you’ve said your one point, by the way, we can hold you accountable for delivering on the access question. I think in the two papers we talk about the need to build a robust assurance ecosystem, and one of those things is changing incentives. Funny enough, in another session this week, there was a question about whether we have differences in the way we’ve been talking about safety over the last few years, whether we still have those divergences or whether we’ve converged. And there are a few themes that we’ve actually converged on, which is nice, and I think assurance is one of them. And this week, a lot of the discussions we’ve had are in some of those incentive areas, like insurance to support assurance.

So what does that look like? How do we drive new incentives or put some of these structures in place to drive a more mature and robust ecosystem? I think that’s going to be really important. The second is professionalization. There are a lot of questions around how you trust the assurer. So how do we ensure that we’re thinking about the skills? What does accreditation look like for assurance organizations or individuals? That will help, I think, with questions around access. So that’s the second piece. And because this panel is also about agents: I think that some of those foundational questions haven’t yet been resolved.

And so I’m hoping that we can move the dial to start thinking about how you apply that to some of these future questions. And just to shout you out, Madhu: Madhu is the brains behind our safety work, and she came up with a paper on real-time failure detection and monitoring of agents. What I really like about that paper is that it talks about a tiered approach to assurance. So when you think about agent deployments, do you need to be thinking about assurance based on the risks or the stakes at hand? Is it in the financial services sector? Is it about making medical decisions? So how do you tie it as closely as possible to the use case and the risks?

And that needs to be linked to reversibility: what is the possibility of reversing actions, and what are the consequences? And then third, we have affordances: what affordances do you give to the agents, and how much autonomy do they have? So how do you design an assurance ecosystem with all of these different components in mind, with a kind of tiered approach? And the more that we can advise CAISI and the many policymakers who are clearly trying to make decisions in this area, I think that’s what success would look like for us.
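The tiered approach described here, combining stakes, reversibility, and affordances, can be sketched as a toy scoring rule. The dimensions, weights, and tier names below are illustrative assumptions for exposition, not PAI's actual framework:

```python
from dataclasses import dataclass

@dataclass
class AgentDeployment:
    """Toy model of the three dimensions discussed above. The scales,
    weights, and tier names are illustrative assumptions only."""
    stakes: int        # 1 = low (e.g. shopping) .. 3 = high (e.g. medical)
    reversible: bool   # can the agent's actions be undone?
    autonomy: int      # 1 = suggests only .. 3 = acts without approval

def assurance_tier(d: AgentDeployment) -> str:
    """Map a deployment to an assurance tier: higher stakes, irreversible
    actions, and broader affordances all push toward heavier assurance."""
    score = d.stakes + d.autonomy + (0 if d.reversible else 2)
    if score >= 6:
        return "continuous monitoring + third-party audit"
    if score >= 4:
        return "pre-deployment evals + periodic review"
    return "baseline testing"

# A reversible, low-stakes shopping agent versus a fully autonomous
# agent making irreversible, high-stakes (e.g. medical) decisions.
print(assurance_tier(AgentDeployment(stakes=1, reversible=True, autonomy=2)))
print(assurance_tier(AgentDeployment(stakes=3, reversible=False, autonomy=3)))
```

The design point is that assurance effort scales with the deployment, not the model: the same underlying system could land in different tiers depending on where and with what permissions it is deployed.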

Madhu Srikumar

This was totally not planned, Steph plugging our work here, but I can’t imagine a better note to end on. It’s a field-wide challenge, but I just want to emphasize the field-wide opportunity: no single organization can get this right. So hopefully that’s a helpful reminder as we end this summit and move on to the next iteration. Thank you, everyone. Hope you have a great, safe flight back home. Fred, that’s tonight for you. For a closing keynote, I’m going to welcome Natasha Crampton, who’s the Chief Responsible AI Officer at Microsoft. And after that, we’ll hear from Chris, who’s the CEO of FMF. Thanks, everyone. Do you want to give it?

Okay, so we’re going to get mementos. Sorry, you might want to come back. You don’t want to miss this. Thank you very much.

Natasha Crampton

Thanks so much, Madhu, and to all of our panellists for what was, I think, a very rich and grounded, and also at times humorous, discussion. Thank you. One of the things that came across clearly for me today is that AI assurance can no longer be just a theoretical exercise; we actually need to build it into an operational discipline. And that’s a discipline that really needs to work across borders, across languages and cultures, and, I think, increasingly across agentic systems, systems that don’t just generate outputs but actually take action. I heard this panel focus on the fact that assurance is pretty uneven today. It’s often strongest where there’s access to compute and data and evaluation infrastructure, and weakest where those things are scarce.

And as several of our panelists emphasized, if we don’t address that gap deliberately, the shift towards AI agents is only going to make that divide even worse, rather than closing it. When I think about the nature of assurance with agentic systems, I think it does need to change in its emphasis somewhat. Pre-deployment testing has always been necessary for all types of systems, and so too has post-deployment testing, of course. But post-deployment testing in an agentic world takes on an even greater level of importance, in my view. When systems can plan, chain actions, interact with tools, and adapt over time, assurance really has to move towards continuous monitoring, real-time detection, and clear accountabilities for when interventions need to take place.

That can be quite a hard technical problem, but it’s also a governance challenge. So I know that PAI is known for convening communities of not just thinkers, but also doers. And so I wanted to leave everyone with a couple of ideas of implications that really follow from some of the insights that we heard today. The first is that it’s really important that we build assurance into systems as part of the system development lifecycle. And we don’t just seek to bolt it on at the end. So that means that we need to design systems so that they can be observed and audited and constrained in practice, not just in policy documents. Second, assurance has to be interoperable.

We heard Prime Minister Modi speak yesterday about building in India and delivering to the world. That, I think, is absolutely an aspiration that we should strive towards. But that can only work if we have evidence, evaluation methods, documentation, and signals of risk that are usable across regions and adaptable to local languages, cultures, and deployment realities. Third, assurance has to be shared. No single company or government or institution can do this alone. And that’s especially true for agents, given how pervasive they are expected to become across the economy. We need shared evaluation infrastructure, shared taxonomies, and shared investment in capacity, particularly in the Global South. So for me, this is why organizations like the Partnership on AI, the many collaborators who have come together at this week’s India AI Impact Summit, and open engagement across the community all matter so much in making sure that we get this right.

It’s a really foundational area for collaboration for all of us. Now, my view is that if we do get assurance right, and by right I mean global, inclusive, and also dynamic, then it really does become an enabler of trust and adoption, as Minister Teo said, not a brake on progress. One of the key things that I think we need to do as a community is really to treat assurance as infrastructure, infrastructure that we need to build together and put into practice together. Thanks very much.

Chris Meserole

Well, what a phenomenal session, from the opening and closing keynotes to a really rich and dynamic panel. I cannot think of a better way to close out what has been an extraordinarily rich and dynamic summit as well. I have the impossible task of trying to summarize everything that was just said here. So if you’ll bear with me, I’ll just offer three core themes that seemed to jump out at me. One is that we need to evolve and mature our understanding of assurance. There was a lot of reference to agents here, and the coming prospect of multi-agent environments as well. From evals to mitigations, we need an evolving understanding of how to do assurance.

Second, and probably more importantly, we also heard a lot about assurance as a global effort. Here I loved Steph’s point about the need for greater north-south collaboration. There was a lot of discussion from Fred and others about the need for global standards, harmonizing those standards, and making them interoperable. And there was also a lot of reference to some of the new institutions that have evolved to enable that global dialogue to happen, whether it’s the institution that was announced literally just before this session, an hour ago, for the global network, or the international network of ACs that has also been revitalized recently. And then the last point that really jumped out at me was assurance as a shared responsibility.

And, Vukosi, I loved the point about assurance as a bottom-up effort, and I think it’s one where we all have a role to play, regardless of which sector you are in and regardless of what aspect of assurance you’re taking part in; there’s a role for all of us. So with that, I’m going to leave you with just one final call to action, and that is to get involved. If we want this technology to be safe and secure and trusted, we all have a role to play. So download the great reports that have just come out on this topic. Get involved.

Look at the work that PAI and others are doing as well, and become a part of the conversation about how we’re going to take this amazing technology and really make sure that it’s safe and secure and that we have a way to trust it. In the opening remarks, Rebecca used this great metaphor of the seed: one of the goals of the reports they put out, and of the conversation in this panel, was to plant that seed and watch assurance grow. So I guess the parting thought I would give you is to say let’s all roll up our sleeves, get to work, and make sure that the seed grows.

So with that, thank you. And thank you as well to our panelists and speakers.

Rebecca Finlay

Speech speed

166 words per minute

Speech length

801 words

Speech time

289 seconds

Strengthening AI Assurance Ecosystem

Explanation

Rebecca emphasizes that building a robust AI assurance ecosystem is a priority, linking it to the Delhi Declaration commitments on usage‑data sharing and multilingual evaluation. She calls for concrete actions such as downloading the global assurance divide report to drive implementation.


Evidence

“The first one is Strengthening the AI Assurance Ecosystem.” [8] “And you’ll see, if you do download the report on the global assurance divide, that that is clearly a key piece of work that needs to happen.” [11] “Now that we have half the declaration, and so now we can, as opposed to earlier in the week, start to articulate it, really leaning in with regard to the commitments around, in commitment one, around usage, clarity around usage data, really trying to give some empirical grounding to this work.” [42]


Major discussion point

AI Assurance Ecosystem: definition, components, and need for systematic implementation


Topics

Artificial intelligence | The enabling environment for digital development


Josephine Teo

Speech speed

148 words per minute

Speech length

1271 words

Speech time

513 seconds

Government‑Led Sandbox & Core Assurance Components

Explanation

Josephine outlines that a robust assurance ecosystem requires three pillars – testing, standards, and third‑party auditors – and that a proactive government sandbox can translate into stronger product and service interest for agentic AI.


Evidence

“What leads to an assurance ecosystem system that would be robust enough?” [6] “It’s more likely to translate into stronger interest in a product and service.” [51]


Major discussion point

Governance and Safety of Agentic AI


Topics

Artificial intelligence | Building confidence and security in the use of ICTs


Madhu Srikumar

Speech speed

146 words per minute

Speech length

1068 words

Speech time

436 seconds

Defining AI Assurance & Multilingual Evaluation

Explanation

Madhu defines AI assurance as independent measurement, evaluation and communication of trustworthiness, and stresses the need for multilingual and contextual evaluations as part of the assurance commitment.


Evidence

“PAI just released work on closing the global assurance divide, a lot of what Vukosi just mentioned.” [2] “And the second of those commitments is about strengthening multilingual and contextual evaluations.” [39] “Is it access to the models being tested, or is it something else?” [35]


Major discussion point

AI Assurance Ecosystem: definition, components, and need for systematic implementation


Topics

Artificial intelligence | Capacity development


Frederic Werner

Speech speed

180 words per minute

Speech length

1021 words

Speech time

339 seconds

Global South – Language & Connectivity Gaps

Explanation

Frederic points out the massive offline population, the scarcity of local-language content, and the opportunity for AI to bridge connectivity and trust gaps, while highlighting AI for Good’s inclusive model involving UN agencies and civil society.


Evidence

“So if there’s no content in your local language or dialect … I think AI can actually help to remove that friction …” [13] “…2.6 billion people who remain offline, yeah.” [28] “There’s a huge shortage of high-potential AI for Good use cases, everything from affordable health care to education for all.” [24] “That’s why we have 50 plus UN sister agencies as part of AI for Good, but also making great efforts to bring as many diverse voices to the table from the global south…” [25]


Major discussion point

Global South Challenges: multilingualism, infrastructure, and capacity


Topics

Closing all digital divides | Artificial intelligence | Capacity development


Owen Larter

Speech speed

201 words per minute

Speech length

1152 words

Speech time

342 seconds

Technical Protocols, Security & Affordable Models

Explanation

Owen stresses the importance of interoperable agent‑to‑agent and universal‑commerce protocols, rigorous testing, security considerations, and the development of cheap, low‑latency “flash” models to enable safe, scalable deployment of agentic AI.


Evidence

“It’s also going to be really important if we’re going to do effective and rigorous testing of these systems because that could be very compute intensive as well.” [22] “There are then also assurance standards, which are related, but I think very important as well.” [31] “I mean, to do it safely and ensuring that third party assurance providers consider the security questions at hand.” [34] “we have a range of models that we have developed and bringing to market at Google DeepMind including our very quick flash models which are relatively cheap, quite efficient, very, very quick.” [29]


Major discussion point

Governance and Safety of Agentic AI


Topics

Artificial intelligence | Building confidence and security in the use of ICTs


Vukosi Marivate

Speech speed

178 words per minute

Speech length

562 words

Speech time

189 seconds

Linguistic Diversity & Local Capacity

Explanation

Vukosi highlights the massive linguistic diversity across India and Africa and the shortage of local annotation resources, calling for country‑level policy and technical capacity building.


Evidence

“PAI just released work on closing the global assurance divide, a lot of what Vukosi just mentioned.” [2]


Major discussion point

Global South Challenges: multilingualism, infrastructure, and capacity


Topics

Closing all digital divides | Capacity development


Stephanie Ifayemi

Speech speed

177 words per minute

Speech length

1576 words

Speech time

532 seconds

Closing the Assurance Divide – Six Challenge Areas & Incentives

Explanation

Stephanie outlines six challenge areas (infrastructure, skills, risk profile, documentation, incentives, etc.) and argues that new incentives such as insurance mechanisms are needed to mature the assurance market.


Evidence

“But just thinking about closing the AI assurance divide, we released this paper, and in it we talk about around six challenge areas, from infrastructure to skills.” [1] “The second in terms of closing the assurance divide that we need to account for is risk profile, interestingly.” [5] “But how do you do that in parallel and how much of that resource should be put into other foundational tools for assurance, such as documentation artifacts, which is another area that we focus on a lot at PAI.” [7] “And one of those things is changing incentives.” [38] “How do we drive new incentives or put some of these structures in place to drive a kind of more mature and robust ecosystem?” [52] “And this week, a lot of the discussions we’ve had are in some of those incentive areas like insurance to support assurance.” [48]


Major discussion point

AI Assurance Ecosystem: definition, components, and need for systematic implementation


Topics

Artificial intelligence | Financial mechanisms | Capacity development


Natasha Crampton

Speech speed

136 words per minute

Speech length

637 words

Speech time

279 seconds

Lifecycle Integration & Continuous Monitoring

Explanation

Natasha calls for treating assurance as shared infrastructure, embedding it throughout the system development lifecycle, making it interoperable, shared, and capable of real‑time monitoring and accountability, especially in the Global South.


Evidence

“One of the key things that I think we need to do as a community is really to treat assurance as infrastructure, infrastructure that we need to build together and put into practice together.” [9] “The first is that it’s really important that we build assurance into systems as part of the system development lifecycle.” [12] “Second, assurance has to be interoperable.” [14] “Third, assurance has to be shared.” [15] “When systems can plan … assurance really has to move towards continuous monitoring, real-time detection, and clear accountabilities for when interventions need to take place.” [37] “We need shared evaluation infrastructure, shared taxonomies, and shared investment in capacity, particularly in the global south.” [17] “It’s often strongest where there’s access to compute and data and evaluation infrastructure, and weakest where those things are scarce.” [18]


Major discussion point

AI Assurance Ecosystem: Lifecycle Integration


Topics

Artificial intelligence | Capacity development | Closing all digital divides


Chris Meserole

Speech speed

135 words per minute

Speech length

534 words

Speech time

236 seconds

Summative Call – Global Coordination & Evolving Assurance

Explanation

Chris urges the community to evolve and mature assurance practices for multi‑agent environments, to coordinate globally, and to engage with reports and working groups as a shared responsibility across sectors.


Evidence

“There’s a lot of reference to agents here, the kind of coming prospect of multi-agent environments as well.” [45] “We need from evals to mitigations, we need to have a better kind of an evolving understanding of how to do assurance.” [53] “One is that we need to evolve and mature our understanding of assurance.” [54] “I love the point about kind of assurance as a bottom-up effort, and I think it’s one that, you know, we all have a role to play here, regardless of which sector you are in, regardless of what aspect of assurance you’re taking part in, there’s a role for all of us.” [55]


Major discussion point

Commitments, Incentives, and Future Actions (next 12 months)


Topics

Artificial intelligence | The enabling environment for digital development


Agreements

Agreement points

AI assurance must be a collaborative, shared responsibility across all stakeholders

Speakers

– Natasha Crampton
– Madhu Srikumar
– Chris Meserole
– Stephanie Ifayemi

Arguments

Assurance requires shared evaluation infrastructure, taxonomies, and investment in capacity, particularly in Global South


No single organization can achieve proper AI assurance alone – it requires field-wide collaboration


Everyone has a role to play in making technology safe, secure and trusted regardless of sector


Changing incentives through insurance and professionalization of assurance providers are key to ecosystem maturity


Summary

All speakers emphasized that AI assurance cannot be achieved by individual entities working alone but requires coordinated collaboration across organizations, sectors, and stakeholders globally


Topics

Artificial intelligence | Building confidence and security in the use of ICTs | The enabling environment for digital development


AI assurance must evolve from theoretical frameworks to practical, operational implementation

Speakers

– Natasha Crampton
– Rebecca Finlay
– Frederic Werner

Arguments

AI assurance must evolve from theoretical exercise to operational discipline working across borders, languages and cultures


Partnership on AI convenes communities of both thinkers and doers to move from theory to practice


Standards are essential tools for turning ambitious governance principles into actionable details, especially for international business interactions


Summary

Speakers agreed that the field needs to move beyond conceptual discussions to practical implementation of AI assurance frameworks and standards


Topics

Artificial intelligence | The enabling environment for digital development | Building confidence and security in the use of ICTs


Agentic AI systems require new approaches to assurance due to their autonomous nature

Speakers

– Josephine Teo
– Owen Larter
– Natasha Crampton
– Stephanie Ifayemi

Arguments

Agentic AI offers transformative possibilities but introduces new risks due to increased autonomy and reduced human oversight


Agents represent autonomous systems that achieve goals rather than just following instructions, creating new economic possibilities


Post-deployment monitoring becomes more critical with agentic systems requiring continuous monitoring and real-time detection


Real-time failure detection and tiered assurance approaches are needed based on risk levels and agent autonomy


Summary

All speakers recognized that agentic AI systems fundamentally change the assurance landscape, requiring new monitoring approaches and risk management strategies due to their autonomous capabilities


Topics

Artificial intelligence | Building confidence and security in the use of ICTs


Global South participation and capacity building are essential for effective AI assurance

Speakers

– Vukosi Marivate
– Stephanie Ifayemi
– Frederic Werner
– Natasha Crampton

Arguments

Assurance frameworks designed in US, UK, or Singapore don’t translate well to contexts with different data, languages, and deployment conditions


Global South countries shouldn’t always play catch-up but should participate in forward-looking frameworks like agent standardization


AI can help remove connectivity friction through local language content and applications, but requires proper education and skilling


Assurance requires shared evaluation infrastructure, taxonomies, and investment in capacity, particularly in Global South


Summary

Speakers agreed that meaningful global AI assurance requires active participation from Global South countries and significant investment in building local capacity rather than simply adapting frameworks developed elsewhere


Topics

Artificial intelligence | Closing all digital divides | Capacity development


Similar viewpoints

Both emphasized the critical importance of security considerations and systematic approaches to testing and validation for agentic AI systems

Speakers

– Josephine Teo
– Owen Larter

Arguments

Agent security and new capabilities for misuse are primary safety concerns requiring superior security protocols


Three essential components needed: technical testing, standards development, and third-party assurance providers


Topics

Artificial intelligence | Building confidence and security in the use of ICTs


Both highlighted the need for AI assurance frameworks to account for regional differences in priorities, contexts, and user experiences rather than applying universal standards uniformly

Speakers

– Stephanie Ifayemi
– Vukosi Marivate

Arguments

Different regions prioritize different risks – Pacific Island nations focus more on environmental impacts than other regions


Test once, comply globally requires balancing high-level safety standards with localized individual user experiences


Topics

Artificial intelligence | Environmental impacts | Closing all digital divides


Both emphasized the need for sophisticated, integrated approaches to AI assurance that are built into systems from the ground up and adapt to different risk contexts

Speakers

– Natasha Crampton
– Stephanie Ifayemi

Arguments

Assurance must be built into system development lifecycle rather than bolted on at the end


Real-time failure detection and tiered assurance approaches are needed based on risk levels and agent autonomy


Topics

Artificial intelligence | Building confidence and security in the use of ICTs


Unexpected consensus

Market incentives for AI safety as competitive advantage

Speakers

– Josephine Teo
– Stephanie Ifayemi

Arguments

Companies demonstrating high safety assurance will gain competitive advantage in the market


Changing incentives through insurance and professionalization of assurance providers are key to ecosystem maturity


Explanation

It was unexpected to see strong consensus between a government minister and a civil society researcher on framing AI safety assurance as a market opportunity rather than just a regulatory burden, suggesting a mature understanding of how to align business incentives with safety goals


Topics

Artificial intelligence | The digital economy | Building confidence and security in the use of ICTs


Government leadership through high-risk AI deployment

Speakers

– Josephine Teo
– Frederic Werner

Arguments

Government must lead in testing agentic AI through high-risk applications like citizen services to establish credibility in governance


Multilateral institutions like ITU should focus on practical applications and inclusive participation to develop next-generation standards and policy recommendations


Explanation

There was unexpected alignment between a national government minister and an international organization representative on the need for public sector entities to lead by example in AI deployment rather than just regulate from the sidelines


Topics

Artificial intelligence | Social and economic development | The enabling environment for digital development


Overall assessment

Summary

The speakers demonstrated remarkably strong consensus across multiple dimensions of AI assurance, including the need for collaborative approaches, practical implementation over theoretical frameworks, specialized approaches for agentic systems, and inclusive global participation. There was particular alignment on the urgency of addressing AI assurance challenges while ensuring Global South participation and the importance of moving from reactive to proactive governance approaches.


Consensus level

High level of consensus with significant implications for AI governance. The agreement across diverse stakeholders (government, industry, civil society, international organizations) suggests a mature understanding of AI assurance challenges and creates a strong foundation for coordinated action. The consensus on treating assurance as shared infrastructure and the need for inclusive global approaches indicates potential for effective international cooperation on AI governance frameworks.


Differences

Different viewpoints

Approach to global AI assurance standardization

Speakers

– Vukosi Marivate
– Owen Larter
– Josephine Teo

Arguments

Test once, comply globally requires balancing high-level safety standards with localized individual user experiences


Access to models and compute-efficient systems are crucial for effective third-party evaluations globally


Three essential components needed: technical testing, standards development, and third-party assurance providers


Summary

Marivate argues that Singapore’s ‘test once, comply globally’ vision requires significant localization for individual user experiences and cannot be achieved through universal frameworks alone. Larter focuses on technical access and compute efficiency as the primary barriers. Teo presents a more centralized approach emphasizing standardized testing, universal standards, and third-party validation.


Topics

Artificial intelligence | Building confidence and security in the use of ICTs | Closing all digital divides


Priority focus for addressing AI assurance gaps

Speakers

– Stephanie Ifayemi
– Frederic Werner
– Vukosi Marivate

Arguments

Global assurance divide exists with six challenge areas: infrastructure, skills, languages, risk profiles, and access barriers


Standards are essential tools for turning ambitious governance principles into actionable details, especially for international business interactions


Assurance frameworks designed in US, UK, or Singapore don’t translate well to contexts with different data, languages, and deployment conditions


Summary

Ifayemi takes a comprehensive approach identifying multiple challenge areas requiring parallel attention. Werner prioritizes standards development as the key mechanism for implementation. Marivate emphasizes the fundamental incompatibility of Western-designed frameworks with Global South contexts, suggesting a more bottom-up approach is needed.


Topics

Artificial intelligence | Closing all digital divides | Capacity development | The enabling environment for digital development


Unexpected differences

Role of market incentives versus regulatory approaches

Speakers

– Josephine Teo
– Stephanie Ifayemi

Arguments

Companies demonstrating high safety assurance will gain competitive advantage in the market


Changing incentives through insurance and professionalization of assurance providers are key to ecosystem maturity


Explanation

While both speakers discuss incentive structures, Teo emphasizes market-driven competitive advantages for safety assurance, while Ifayemi focuses on systemic changes through insurance mechanisms and professionalization. This represents different philosophies about whether market forces or institutional frameworks should drive assurance adoption.


Topics

Artificial intelligence | Building confidence and security in the use of ICTs | The enabling environment for digital development


Timing and sequencing of assurance implementation

Speakers

– Natasha Crampton
– Vukosi Marivate

Arguments

Assurance must be built into system development lifecycle rather than bolted on at the end


Test once, comply globally requires balancing high-level safety standards with localized individual user experiences


Explanation

Crampton advocates for integrating assurance from the beginning of system development, while Marivate suggests that meaningful assurance requires post-deployment, localized testing closer to actual user experiences. This represents a fundamental disagreement about when and where effective assurance can be achieved.


Topics

Artificial intelligence | Building confidence and security in the use of ICTs


Overall assessment

Summary

The main areas of disagreement center on the feasibility and approach to global AI assurance standardization, with tensions between centralized versus localized approaches, technical versus capacity-building priorities, and market-driven versus regulatory mechanisms for incentivizing assurance adoption.


Disagreement level

Moderate disagreement with significant implications. While speakers share common goals of safe, trustworthy AI systems globally, their different approaches could lead to fragmented or incompatible assurance frameworks. The disagreements reflect deeper tensions between efficiency and inclusivity, standardization and contextualization, and top-down versus bottom-up governance approaches. These differences could impact the effectiveness of global AI assurance efforts and potentially exclude Global South perspectives from emerging standards.


Partial agreements

All speakers agree that agentic AI systems require enhanced monitoring and assurance approaches due to their autonomous nature, but they disagree on implementation. Crampton emphasizes continuous monitoring and system design integration, Ifayemi focuses on tiered approaches based on risk levels, while Teo advocates for government leadership in testing high-risk applications first.

Speakers

– Natasha Crampton
– Stephanie Ifayemi
– Josephine Teo

Arguments

Post-deployment monitoring becomes more critical with agentic systems requiring continuous monitoring and real-time detection


Real-time failure detection and tiered assurance approaches are needed based on risk levels and agent autonomy


Agentic AI offers transformative possibilities but introduces new risks due to increased autonomy and reduced human oversight


Topics

Artificial intelligence | Building confidence and security in the use of ICTs


All speakers agree on the importance of global inclusion in AI development and assurance, but propose different mechanisms. Larter focuses on technical access and efficient systems, Ifayemi emphasizes early participation in standard-setting, while Werner prioritizes education and capacity building as prerequisites for meaningful participation.

Speakers

– Owen Larter
– Stephanie Ifayemi
– Frederic Werner

Arguments

Access to models and compute-efficient systems are crucial for effective third-party evaluations globally


Global South countries shouldn’t always play catch-up but should participate in forward-looking frameworks like agent standardization


AI can help remove connectivity friction through local language content and applications, but requires proper education and skilling


Topics

Artificial intelligence | Closing all digital divides | Capacity development




Takeaways

Key takeaways

AI assurance must evolve from theoretical frameworks to operational discipline that works across borders, languages, and cultures, especially for agentic AI systems


A significant global assurance divide exists, with frameworks designed in developed countries not translating well to Global South contexts due to different languages, data, infrastructure, and risk priorities


Agentic AI systems introduce new risks due to increased autonomy and reduced human oversight, requiring a shift from reactive regulation to proactive preparation


Three essential components are needed for robust AI assurance ecosystems: technical testing capabilities, standards development, and third-party assurance providers


Post-deployment monitoring becomes more critical with agentic systems, requiring continuous monitoring, real-time detection, and clear intervention accountabilities


No single organization can achieve proper AI assurance alone – it requires field-wide collaboration, shared infrastructure, and investment in Global South capacity


Companies demonstrating high safety assurance will gain competitive advantage, making assurance a strategic differentiator rather than compliance burden


Technical interoperability standards (like agent-to-agent protocols) are essential for the emerging agentic economy, similar to HTTP/URL standards for the internet


Resolutions and action items

Download and engage with PAI’s two new reports on strengthening AI assurance ecosystem and closing the global assurance divide


Build assurance into system development lifecycle rather than bolting it on at the end


Develop shared evaluation infrastructure, taxonomies, and capacity investment particularly in Global South


Create technical protocols for agent-to-agent communication and standardized information exchange


Establish tiered assurance approaches based on risk levels, use cases, and agent autonomy levels


Develop multilingual evaluations and benchmarks that account for diverse language ecosystems


Strengthen north-south collaboration to ensure Global South countries participate in forward-looking frameworks rather than playing catch-up


Change incentives through insurance mechanisms and professionalization of assurance providers


Treat assurance as shared infrastructure that needs collaborative building and implementation


Unresolved issues

How to balance high-level safety standards with localized individual user experiences in ‘test once, comply globally’ approaches


Specific mechanisms for ensuring Global South participation in emerging agent standardization efforts


How to prioritize limited resources between upstream infrastructure needs versus other foundational assurance tools


Technical challenges of continuous monitoring and real-time failure detection for complex multi-agent systems


How to effectively scale multilingual evaluations across thousands of languages and dialects


Determining appropriate levels of access to models for third-party evaluators while maintaining security


How to build sufficient local policymaker capacity and capability in Global South countries


Balancing compute-intensive assurance requirements with accessibility and cost constraints


Suggested compromises

Develop tiered assurance approaches that scale requirements based on risk levels, use cases, and consequences rather than one-size-fits-all solutions


Balance upstream infrastructure investment with parallel development of foundational documentation and evaluation tools


Create both high-level global safety standards and localized evaluation mechanisms that account for individual user contexts


Develop quick and cheap models alongside more compute-intensive ones to enable broader access to agentic systems and their evaluation


Focus on practical applications and learning-by-doing approaches rather than purely theoretical frameworks


Combine pre-deployment testing with enhanced post-deployment monitoring, recognizing the latter’s increased importance for agentic systems


Thought provoking comments

We need to shift from reactive regulation to proactive preparation… Government is high risk because the touch points with citizens are very sensitive. No citizen and no government wants to make serious mistakes when they interact with their citizens – telling them things about their health, telling them things about their social security, telling them things to do with their benefits that are not accurate, and having them not just be told but acted upon.

Speaker

Josephine Teo


Reason

This comment reframes the entire approach to AI governance from a defensive to an anticipatory stance, while highlighting the unique vulnerabilities of government-citizen interactions with AI agents. It introduces the concept that governments must be ‘leaders not laggards’ in testing AI systems precisely because the stakes are so high.


Impact

This shifted the discussion from theoretical frameworks to practical implementation challenges, establishing the foundation for subsequent speakers to discuss concrete assurance mechanisms. It also introduced the critical insight that high-stakes environments require more rigorous testing, which influenced later discussions about tiered assurance approaches.


I’ve yet to see a high-potential use case developed in Brussels work equally well in Johannesburg and Shenzhen and maybe Panama. Like, it’s just, we haven’t really reached that yet… if you don’t address that literacy piece, then it’s just going to be a crapshoot. We’re not sure [if AI deployment will go in the right direction].

Speaker

Frederic Werner


Reason

This comment challenges the assumption that AI solutions are universally transferable and introduces the sobering reality that good intentions don’t guarantee positive outcomes. It highlights the critical gap between AI development in major tech hubs and real-world deployment in diverse global contexts.


Impact

This comment fundamentally shifted the conversation from technical standards to contextual adaptation, setting up the entire framework for discussing the ‘global assurance divide.’ It directly influenced subsequent discussions about multilingual evaluations, local capacity building, and the need for bottom-up rather than top-down approaches.


When you think about the Pacific Island nations, for example, they would be thinking about assuring for environmental impacts differently than environmental impacts might be considered important in the US at the moment… it’s really interesting that, in terms of closing the divide, the starting point – or what you put emphasis on – might vary.

Speaker

Stephanie Ifayemi


Reason

This insight reveals that the ‘global assurance divide’ isn’t just about capacity or resources—it’s about fundamentally different risk priorities and values across regions. It challenges the assumption that assurance frameworks can be universally applied without considering local priorities and contexts.


Impact

This comment deepened the conversation by moving beyond technical and resource gaps to cultural and contextual differences in risk assessment. It influenced the discussion toward more nuanced approaches to assurance that account for local values and priorities, rather than one-size-fits-all solutions.


It will not be top-down. I don’t believe that. It will be them understanding – whether it’s labor laws, data governance, or just monitoring of systems once they’re on. If there is not that capacity or capability to actually do those things, again, it’s a more automated direction that is not necessarily what the values of those people actually are.

Speaker

Vukosi Marivate


Reason

This comment challenges the prevailing assumption that AI governance frameworks can be effectively imposed from global institutions or developed countries. It emphasizes that without local capacity and understanding, AI systems may operate counter to local values and needs.


Impact

This fundamentally reoriented the discussion toward bottom-up capacity building and local ownership of AI governance. It influenced subsequent conversations about the need for decentralized assurance approaches and the importance of building local expertise rather than relying solely on global standards.


Post-deployment testing in an agentic world takes on an even greater level of importance… When systems can plan and they can chain actions, they can interact with tools, they can adapt over time, assurance really has to move towards continuous monitoring, real-time detection, and clear accountabilities for when interventions need to take place.

Speaker

Natasha Crampton


Reason

This comment identifies a fundamental shift in how assurance must evolve for agentic AI systems. Unlike traditional AI that produces outputs, agents take actions, requiring a completely different approach to safety and monitoring that emphasizes continuous rather than pre-deployment testing.


Impact

This insight crystallized the technical and governance challenges specific to AI agents that had been building throughout the discussion. It provided a clear framework for understanding why existing assurance approaches may be insufficient and influenced the final call to action about treating assurance as shared infrastructure.


Overall assessment

These key comments collectively transformed the discussion from a theoretical exploration of AI assurance to a practical, globally-conscious examination of implementation challenges. The conversation evolved through three critical phases: first, establishing the urgency and stakes of proactive AI governance; second, revealing the fundamental limitations of current approaches when applied globally; and third, articulating the need for new paradigms that account for local contexts, continuous monitoring, and shared responsibility. The most impactful insight was the recognition that effective AI assurance cannot be achieved through universal frameworks imposed from major tech centers, but requires bottom-up capacity building, cultural sensitivity, and adaptive approaches that reflect local values and priorities. This realization fundamentally shifted the conversation from ‘how do we scale existing assurance methods’ to ‘how do we build inclusive, locally-relevant assurance ecosystems that can evolve with rapidly advancing AI capabilities.’


Follow-up questions

How do we close the divide that exists between countries in the Global South versus others in AI assurance?

Speaker

Rebecca Finlay


Explanation

This addresses fundamental inequities in AI assurance capabilities and access globally, which is critical for inclusive AI development


What does it mean to do AI assurance globally around the world?

Speaker

Rebecca Finlay


Explanation

Understanding how to implement AI assurance across different contexts, cultures, and regulatory environments is essential for global AI governance


How do we demonstrate that the risks have been managed well in agentic AI systems?

Speaker

Josephine Teo


Explanation

This is fundamental to building trust and confidence in autonomous AI systems that can take actions without human oversight


What would be the components that lead to an assurance ecosystem robust enough for AI, and specifically agentic AI?

Speaker

Josephine Teo


Explanation

Identifying the essential building blocks for AI assurance infrastructure is crucial for systematic implementation


How do we turn those ambitious words and principles into actions in AI governance frameworks?

Speaker

Frederic Werner


Explanation

Bridging the gap between policy intentions and practical implementation is a persistent challenge in AI governance


How can you bake in common sense principles (trustworthy, verifiable, secure, safe, human rights-based, inclusive) when developing AI standards?

Speaker

Frederic Werner


Explanation

Ensuring fundamental values are embedded in technical standards rather than added as afterthoughts is critical for responsible AI


How do we ensure that if AI helps people leapfrog infrastructure in the Global South, it goes in the right direction?

Speaker

Frederic Werner


Explanation

Preventing unintended negative consequences when AI is rapidly adopted in developing regions is important for equitable development


How do we create superior security protocols for agentic systems connected to different accounts and services?

Speaker

Owen Larter


Explanation

Security becomes more complex when autonomous systems have access to sensitive personal and financial information


How do we balance assurance needs and prioritize across the AI value chain, especially regarding upstream infrastructure versus other foundational tools?

Speaker

Stephanie Ifayemi


Explanation

Resource allocation decisions in AI assurance require strategic thinking about where to invest limited resources for maximum impact


How do we ensure that Global South countries aren’t missing from agent standardization work and aren’t always playing catch up?

Speaker

Stephanie Ifayemi


Explanation

Inclusive participation in setting AI standards is essential to prevent further marginalization of developing countries


How do you trust the assurer? What does accreditation look like for assurance organizations or individuals?

Speaker

Stephanie Ifayemi


Explanation

Establishing credibility and standards for those conducting AI assurance is fundamental to the integrity of the entire system


How do you design an assurance ecosystem with different risk levels, reversibility considerations, and agent autonomy levels in mind using a tiered approach?

Speaker

Stephanie Ifayemi


Explanation

Creating flexible assurance frameworks that can adapt to different risk contexts and use cases is necessary for practical implementation


How do we build assurance into systems as part of the system development lifecycle rather than bolting it on at the end?

Speaker

Natasha Crampton


Explanation

Integrating assurance from the design phase is more effective than retrofitting it, but requires new development practices


How do we make assurance interoperable across regions while adaptable to local languages, cultures, and deployment realities?

Speaker

Natasha Crampton


Explanation

Balancing global standards with local adaptation is a key challenge for scalable AI assurance


Disclaimer: This is not an official session record. DiploAI generates these resources from audiovisual recordings, and they are presented as-is, including potential errors. Due to logistical challenges, such as discrepancies in audio/video or transcripts, names may be misspelled. We strive for accuracy to the best of our ability.