Advancing Scientific AI with Safety, Ethics, and Responsibility
20 Feb 2026 11:00h - 12:00h
Summary
The panel examined how the rapid emergence of AI-driven biodesign tools is reshaping biosecurity governance, moving risk from physical labs to the design stage and demanding new oversight mechanisms [7-13]. Participants stressed that data governance, model evaluation and red-team exercises remain essential components of this response [14-15].
Speakers argued that a single central authority in Delhi would be ineffective and called for decentralized, institution-level checks such as empowering biosafety officers and creating adaptive, non-periodic oversight mechanisms [24-28][31]. They noted that traditional paper-based facility inspections are outdated for AI-enabled research and that adaptive, rapid review processes are needed [133-138]. The discussion highlighted the uneven resources across Indian research institutions and the necessity of training more personnel in chemical, AI, and nuclear security to match the vibrant but heterogeneous scientific ecosystem [15-18][122-130][133-140].
In response to concerns about open science, the panel advocated a tiered-access model combined with pre-deployment assessments and “know-your-customer” style vetting, arguing that blanket restrictions would stifle innovation while differentiated capability-level governance could mitigate misuse [41-48][49-57]. They emphasized that open-source tools are critical for innovation in low-resource settings and should not be conflated with danger [54-57]. RAND Europe’s global risk index and structured rubrics for evaluating frontier models before release were cited as useful tools [43-45][106-108].
The discussion highlighted that many Southeast Asian countries lack AI readiness and that safety evaluations must incorporate sociocultural contexts, small-model edge deployments, and accountability frameworks, with self-regulation complementing formal standards [66-71][75-79]. Capacity-building measures such as AI literacy and awareness programs for marginalized communities were identified as necessary to close gaps [175-176]. Proposals included regular six-monthly independent monitoring, an AI safety institute linked to governments, and shared incident-reporting mechanisms with tiered confidentiality to enable coordinated global oversight [105-119][161-170][226-235].
Overall, the panel concluded that effective AI-biosecurity governance will require decentralized yet integrated oversight, capacity building in the Global South, socio-technical risk assessment, and harmonised data and legal frameworks. Coordinated agency and cross-border collaboration will be needed to avoid fragmentation while ensuring both scientific progress and safety [24-31][122-154][161-183][209-242][255-263].
Keypoints
Major discussion points
– AI is moving bio-risk upstream, demanding new governance structures.
The rapid rise of AI-enabled biodesign tools decouples risky capabilities from traditional physical containment, shifting the threat to the design stage of biology [6-13]. This calls for “more decentralized checks and balances” and empowerment of existing biosafety and information-security offices, rather than a single central authority [23-28][24-27].
– Balancing open-science benefits with controlled access through tiered, capability-based governance.
Participants argue for “tiered access and contextual norms” and stress the importance of pre-deployment assessments with structured rubrics [40-48]. They warn against conflating open-source with danger, advocating differentiated governance at the capability level rather than blanket restrictions [55-58].
– Context-specific capacity building and socio-technical evaluation for the Global South.
The panel highlights gaps in AI readiness, the need for small-model, edge-focused solutions, and participatory risk assessments that reflect local socio-cultural realities [62-71][73-78]. Self-regulation and unified, adaptable frameworks are seen as essential for countries like India and other low-resource settings [78-79].
– Institutionalizing independent evaluation, red-teaming, and continuous monitoring.
A six-monthly “ritual” of risk assessment, possibly run by an AI-safety institute with formal government links, is proposed to embed systematic oversight [105-112][113-119]. Such mechanisms would require significant multilateral investment and coordination with bodies like the WHO or the Biological Weapons Convention [116-118].
– Ensuring interoperable, cross-border biosurveillance and data-sharing.
The discussion stresses the current fragmentation of data standards and legal regimes, recommending harmonised standards (e.g., federated HL7-FHIR-like frameworks), pre-negotiated legal safe-harbors, and shared evaluation criteria to enable coordinated pandemic-response and bio-risk monitoring [212-224][226-235].
Overall purpose / goal of the discussion
The panel aimed to map the emerging security and governance challenges posed by AI tools that can design biological agents, and to explore practical policy, technical, and institutional pathways (ranging from decentralized oversight to international data-standard harmonisation) that can preserve the benefits of open scientific innovation while preventing misuse.
Overall tone and its evolution
The conversation maintained a professional, solution-oriented tone throughout. It began with a broad framing question, moved into a diagnostic phase highlighting risks and structural gaps, then shifted to constructive proposals and concrete action items. While the urgency of the biosecurity threat was repeatedly underscored, the tone remained collaborative rather than alarmist, ending on a forward-looking note emphasizing coordination and capacity-building.
Speakers
– Moderator
– Role/Title: Conference Moderator (session moderator) [S15]
– Area of Expertise: Session moderation
– Speaker 1
– Role/Title: (not specified in external sources)
– Area of Expertise: Biosecurity, AI-enabled biodesign, risk governance (as discussed in the transcript)
– Speaker 2
– Role/Title: (not specified in external sources)
– Area of Expertise: AI safety and security governance, independent evaluation and red-teaming of AI systems (as discussed in the transcript)
– Speaker 3
– Role/Title: (not specified in external sources)
– Area of Expertise: AI policy, socio-technical assessment, AI readiness for the Global South (as discussed in the transcript)
– Audience Member 1
– Role/Title: Founder of Corral Inc [S3]
– Area of Expertise: (not specified)
– Audience Member 2
– Role/Title: Participant from a German group [S18]
– Area of Expertise: (not specified)
– Audience Member 3
– Role/Title: (not specified)
– Area of Expertise: (not specified)
Additional speakers:
– Justin – mentioned only by name in the closing remarks; no role, title, or expertise provided.
Opening framing – The moderator began by asking whether the challenges of AI-enabled biodesign should be framed primarily as a data-governance problem, a model-design issue, or a verification-and-compliance matter [1].
Speaker 1 (biosecurity perspective) – He clarified that he is a biosecurity specialist, not an AI-safety expert, and described a deep structural change in the life sciences: risk governance has traditionally relied on physical controls such as lab inspections and material-transfer agreements [7-8]. The rapid emergence of more than 1,500 AI-enabled biodesign tools, from protein engineering to pathogen-host interaction modelling, has begun to decouple risky capabilities from those physical containment measures, moving the risk “upstream” to the design phase [9-13]. While data governance, model evaluation and red-team exercises remain essential [14-15], they must be complemented by new upstream mechanisms. He called for training personnel in chemical, AI and nuclear security [19-22] and for empowering information-security and biosafety offices to handle emerging AI risks [23-28][31]. Rather than a single central authority in Delhi, he advocated adaptive, decentralized oversight that goes beyond periodic, paper-based inspections [23-28][31]. He also proposed a tiered risk-classification scheme for AI-enabled biodesign tools, with higher scrutiny for virus-focused models [122-156]. Two additional points were made in his closing remarks: (i) the digital-to-physical barrier: even freely available AI designs require physical infrastructure to become actual pathogens, preserving a control point [250-255]; and (ii) CEPI’s agentic-AI platform for vaccine development is already being used to detect jailbreak and misuse attempts [300-310].
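The tiering idea can be made concrete with a small sketch. The tier names, criteria and the mapping to oversight actions below are illustrative assumptions in Python; the panel stated the principle (virus-trained design tools warrant higher scrutiny) but gave no implementation:

    # Illustrative tiered risk-classification rubric for AI biodesign tools.
    # Tier names and criteria are hypothetical; the session only established
    # the principle that virus-focused design models deserve more scrutiny.
    from dataclasses import dataclass
    from enum import Enum

    class RiskTier(Enum):
        LOW = "routine institutional review"
        ELEVATED = "cross-trained AI-biosafety panel review"
        HIGH = "pre-deployment assessment plus credentialed (KYC-style) access"

    @dataclass
    class BiodesignTool:
        name: str
        trained_on_viral_data: bool    # pathogen/virus-focused training data
        enables_sequence_design: bool  # can propose novel DNA/protein sequences
        openly_downloadable: bool      # weights or tool freely available

    def classify(tool: BiodesignTool) -> RiskTier:
        """Assign an oversight tier; virus-focused design tools rank highest."""
        if tool.trained_on_viral_data and tool.enables_sequence_design:
            return RiskTier.HIGH
        if tool.trained_on_viral_data or (
            tool.enables_sequence_design and tool.openly_downloadable
        ):
            return RiskTier.ELEVATED
        return RiskTier.LOW

    tool = BiodesignTool("protein-optimizer", trained_on_viral_data=True,
                         enables_sequence_design=True, openly_downloadable=True)
    print(classify(tool).value)  # -> the HIGH-tier oversight action

Such a rubric would sit alongside, not replace, the institutional reviews described above; the point is that the classification criteria become explicit and auditable.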
Open-science discussion (Speaker 2) – Responding to the moderator’s second question, he said a binary answer was impossible and advocated a “tiered access and contextual norms” approach [41-42]. He praised RAND Europe’s global risk index and its structured pre-deployment assessment rubrics, likening them to “know-your-customer” (KYC) procedures that can credential researchers for defensive work while preserving open-source innovation for low-resource settings [43-57]. He stressed that blanket restrictions would stifle innovation; instead, differentiated, capability-level governance could mitigate misuse without conflating open-source tools with danger [54-58]. He also warned that once frontier models are released, the danger is “already out there” and cannot be easily withdrawn [84-89][105-112].
Institutional gaps (Speaker 3, Geeta) – Asked to identify the most immediate gaps, she noted that AI readiness varies dramatically across the Global South: India ranks third globally, but many Southeast Asian nations lag far behind [62-64]. Large language models are typically trained on Western data, and existing safety benchmarks show a 20-30% failure rate in biological risk assessments [65-71][73-78]. She called for socio-cultural evaluations, small-model edge deployments, and participatory risk-assessment processes involving end-users [65-78]. India’s policy of voluntary self-regulation was highlighted, and she urged a unified yet adaptable framework that can be tailored to diverse deployment environments [78-79].
Independent evaluation norm (Speaker 2) – In response to whether independent evaluation and red-team testing should become a global norm, he drew an analogy to nuclear oversight (IAEA) and noted that biology is highly diffused, making traceability difficult [84-89]. Citing a recent SecureBio study in which a frontier LLM (OpenAI’s o3) outperformed expert virologists on wet-lab troubleshooting [100-104], he proposed a systematic six-monthly monitoring ritual conducted by a credentialed, independent AI-safety institute with formal government links [84-89][105-112]. He clarified that the institute would anchor its work to the Biological Weapons Convention or the WHO, even though the relationship is not yet fully established [113-119][116-118].
Feasibility in heterogeneous ecosystems (Speaker 1) – He described the “wide heterogeneity” of Indian institutions, ranging from well-resourced labs to under-funded centres [122-131]. Traditional periodic, paper-based inspections are outdated; instead, rapid, adaptive review processes are required [133-138]. He called for upstream safeguards, cross-trained AI-biosafety review panels, and investment in domestic evaluation capacity such as the AI safety institute at IIT Madras [147-154]. Leveraging tech-sovereignty measures to control the import and deployment of critical AI models was also recommended [155-156].
Emerging scientific powers shaping governance (Speaker 3) – Geeta explained that India is already creating “sandboxes” for health and ideological AI systems and that a forthcoming Global-South network for trustworthy AI and an AI-safety commons will enable low-resource countries to share tools, benchmarks and best practices [161-166]. She described an incident-reporting framework tailored to Indian contexts, capturing a taxonomy of harms, including physical, psychological, cyber, socio-economic and environmental impacts, and supporting capacity-building programmes for healthcare workers [169-176][267-274]. These initiatives are complemented by collaborative multi-stakeholder efforts and the recently published AI-governance guidelines from MeitY [178-182][260-262].
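As a rough illustration of how such a harms taxonomy might be encoded, the sketch below uses the five categories named in the session; the record fields and their names are hypothetical placeholders, not the actual incident-reporting schema:

    # Illustrative incident record using the harm categories mentioned in the
    # session. Field names and structure are assumptions for illustration only.
    from dataclasses import dataclass, field
    from datetime import date
    from enum import Enum

    class HarmCategory(Enum):
        PHYSICAL = "physical"
        PSYCHOLOGICAL = "psychological"
        CYBER = "cyber"
        SOCIO_ECONOMIC = "socio-economic"
        ENVIRONMENTAL = "environmental"

    @dataclass
    class IncidentReport:
        reported_on: date
        deployment_context: str                    # e.g. "rural health clinic"
        harms: list = field(default_factory=list)  # list of HarmCategory
        affects_marginalized_group: bool = False   # flags under-recorded harms
        description: str = ""

    report = IncidentReport(
        reported_on=date(2026, 2, 20),
        deployment_context="community health screening app",
        harms=[HarmCategory.PSYCHOLOGICAL, HarmCategory.SOCIO_ECONOMIC],
        affects_marginalized_group=True,
        description="Triage model systematically deprioritized one dialect group.",
    )
    print([h.value for h in report.harms])

The affects_marginalized_group flag echoes the session’s point that harms to marginalized communities often go unrecorded and need to be surfaced explicitly.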
Scope of safety evaluation (Speaker 1) – He broadened the discussion from model-centric assessment to a full socio-technical appraisal [189-203]. Key considerations include capability uplift relative to governmental capacity, incentive structures, cross-border diffusion of risk, and the digital-to-physical barrier that still limits the translation of malicious code into real pathogens [194-201][250-255]. He warned that without integrating AI evaluation into existing biosafety and research-security systems, audits would merely scrutinise algorithms while ignoring the institutions that operationalise them [200-202].
Avoiding fragmentation (Speaker 2) – He highlighted that many countries are deploying AI-driven biosurveillance (syndromic, genomic sequencing, outbreak modelling) on incompatible data standards and legal regimes, leading to dangerous data hoarding, as observed during the COVID-19 pandemic [212-224]. He proposed three remedies: (i) harmonising data standards through a federated HL7-FHIR-like framework for public-health surveillance; (ii) establishing pre-negotiated legal safe-harbours for cross-border data sharing during emergencies; and (iii) agreeing on shared evaluation criteria that can be embedded in national surveillance systems [226-235][230-236]. He also noted the siloing between AI-governance and biosecurity communities, which creates a “gap where the risk happens” [237-241].
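A minimal sketch of the safe-harbour remedy, assuming a hypothetical registry of pre-negotiated bilateral agreements and FHIR-style field names (this is not an actual HL7 FHIR profile):

    # Cross-border sharing of a surveillance record is allowed only when an
    # emergency is declared AND a pre-negotiated agreement already exists.
    # Country pairs and field names are invented for illustration.
    from dataclasses import dataclass

    SAFE_HARBOUR_AGREEMENTS = {("MY", "SG"), ("KH", "SG"), ("IN", "SG")}

    @dataclass
    class SurveillanceRecord:
        record_id: str
        jurisdiction: str   # ISO country code of origin
        syndrome_code: str  # harmonised, FHIR-style coding
        week: str           # e.g. "2026-W08"
        case_count: int

    def may_share(record: SurveillanceRecord, recipient: str,
                  emergency_declared: bool) -> bool:
        """Sharing requires a declared emergency and a standing agreement."""
        pair = tuple(sorted((record.jurisdiction, recipient)))
        return emergency_declared and pair in SAFE_HARBOUR_AGREEMENTS

    rec = SurveillanceRecord("r-001", "SG", "ILI", "2026-W08", 412)
    print(may_share(rec, "MY", emergency_declared=True))   # True
    print(may_share(rec, "LA", emergency_declared=True))   # False: no agreement

Because the agreements are negotiated before any outbreak, the sharing decision at crisis time reduces to a lookup rather than a negotiation, which is exactly the point the speaker made.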
Closing remarks (moderator) – The panel’s key points were summarised: safety evaluation is systemic; incident-response mechanisms and cross-border solutions are needed; and a balance must be struck between open-source innovation and managed access [255-263].
Audience Q&A –
* Harms taxonomy: a researcher asked to expand the definition of harms beyond physical injury; Geeta’s team explained that their incident-reporting framework already categorises physical, psychological, cyber, socio-economic and environmental harms and that they are developing toolkits to assess healthcare-worker perceptions of AI [264-274][267-274].
* Model drift: a participant raised temporal model drift; Geeta responded that monitoring data-distribution drift is part of the system-monitoring approach and a key safety criterion [286-288] (see the sketch after this list).
* Web of prevention: Speaker 1 advocated a decentralized yet integrated leadership structure that empowers biosafety officers and provides a top-level reporting channel [294-299]; Speaker 2 illustrated Singapore’s multi-agency coordination model (NEA, MOH, Communicable Disease Agency, PREPARE) as a concrete example of an effective “web of prevention” [300-313].
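On the model-drift question, the monitoring idea can be sketched with the population stability index, one common drift measure; the 0.2 alert threshold and the binning below are conventional assumptions, not values given in the session:

    # Compare the live input distribution for one feature against the training
    # distribution and flag drift. PSI > 0.2 is a conventional alert level.
    import math

    def psi(expected, observed, bins=10):
        """Population stability index between a reference sample (training
        data) and a live sample; larger values mean more drift."""
        lo = min(min(expected), min(observed))
        hi = max(max(expected), max(observed))
        width = (hi - lo) / bins or 1.0
        total = 0.0
        for b in range(bins):
            left, right = lo + b * width, lo + (b + 1) * width
            last = b == bins - 1
            def frac(xs):
                n = sum(1 for x in xs if left <= x < right or (last and x == hi))
                return max(n / len(xs), 1e-6)  # avoid log(0) on empty bins
            e, o = frac(expected), frac(observed)
            total += (o - e) * math.log(o / e)
        return total

    training = [0.1 * i for i in range(100)]    # reference distribution
    live = [0.1 * i + 3.0 for i in range(100)]  # inputs have shifted over time
    score = psi(training, live)
    print("PSI = %.2f -> %s" % (score, "drift: reassess" if score > 0.2 else "stable"))

In production such a check would run per feature on a schedule, feeding the incident-reporting and reassessment processes discussed above rather than silently triggering retraining.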
Consensus & tensions – The panel agreed on the necessity of decentralized, capability-based governance; the importance of pre-deployment assessments combined with continuous AI-driven monitoring; the urgency of capacity-building and tech-sovereignty measures in the Global South; and the need for harmonised data standards and legal safe-harbours to avoid fragmentation. Divergence remained on the optimal locus of oversight (whether a fully decentralized network of local checks or a centrally linked AI-safety institute is preferable) and on the degree to which open-source tools should be subject to tiered access controls. These tensions point to hybrid models that blend bottom-up empowerment with top-down coordination, and to further work on funding mechanisms, operational designs for six-monthly monitoring, and concrete protocols for DIY and small-scale commercial bio-AI activities.
Action items:
– Launch the Global-South trustworthy-AI network and an AI-safety commons.
– Adopt tiered, capability-level access and pre-deployment rubrics for high-risk biodesign tools.
– Embed AI-safety checks into grant-review and institutional-review processes.
– Establish a six-monthly independent monitoring regime via a credentialed AI-safety institute linked to the BWC/WHO.
– Develop a tiered risk-classification scheme for biodesign tools.
– Create federated data-standard frameworks (e.g., HL7-FHIR-adapted) and pre-negotiated legal safe-harbours for emergency data sharing.
– Roll out an incident-reporting taxonomy covering the full spectrum of harms.
– Invest in capacity-building programmes for biosafety officers, AI-safety personnel and tech-sovereignty measures.
Unresolved issues include the precise governance and funding structures for the proposed institute, operationalising tiered access without stifling legitimate open-source research, and scaling continuous model-drift detection in low-resource settings. Addressing these will be essential for a resilient, inclusive governance regime that safeguards both scientific progress and global biosecurity.
Key area: should we think about it as a data-governance problem, a problem in model design, or should it be more on a verification or compliance angle?
Thank you very much, Shyam, for having me, and good morning to everyone, and welcome to this session. So let me maybe just start with saying that I’m not an AI or AI-safety expert, so whatever I say, take it with a pinch of salt. My work is in biosecurity, and that’s the angle I’ll come from. I think all of those things, whether it’s model evaluation or other things, are there, and those are very, very important factors that we need to keep in mind. But on top of that, there is also a very important, deep structural change that is happening. For example, in the field of life sciences, historically, whatever risk and risk-governance mechanisms we had were very much linked to the physical infrastructure: lab facilities, facility inspection, material-transfer control, and things like that.
But that seems to have changed, and seems to be changing very rapidly now, with the kind of AI biodesign tools as well as LLMs that are emerging. So I think RAND also did a study on this, but there are probably more than 1,500 biodesign tools out there, and those are totally transforming how life sciences, and science in general, is done. Now, the kind of change that we are seeing is that with these capabilities, it’s now much easier to engineer proteins, optimize DNA sequences to do things that we want, have better pathogen-host modeling, interaction modeling, and things like that. Now, these capabilities are, because of AI, becoming partly decoupled from the physical containment measures which were usually used in the life sciences.
So we have a lot of this risk landscape shifting a little bit more upstream to the design side, at least when it comes to the biological side of things. So yes, data governance matters. Model evaluation and red-teaming are essential, and we should be doing that. But it is also very important, especially for a country like India, where we have a very vibrant scientific ecosystem that is also very uneven, to see how we can bring this rapidly evolving AI-enabled science into the existing mechanisms to some extent, but also at the same time develop those capabilities: have more people with the core capabilities in chemical security, AI, nuclear security, and things like that.
So we need to train more people on those things. So, again, going back to the life sciences: integrating AI evaluation into the biosafety system, strengthening institutional readiness. Some labs and some institutions have information-security labs or information-security offices. How can we get them better prepared for these new emerging risks that are coming due to AI? Some places have biosafety officers or biosecurity officers. How we can enable them better to address the AI risk is the direction that we need to move towards. And have a more adaptive oversight mechanism that is not limited to the once-in-a-while inspection that happens, but that keeps pace with the rapidly evolving things that we are seeing with the AI models coming up.
And I think, just in terms of the paradigm change that we are seeing and that you mentioned, there need to be more decentralized checks and balances and oversight mechanisms. If there is one authority sitting somewhere in Delhi and trying to do everything, that’s not going to work. So that is one of the things that we have to collectively think about. How do we decentralize these kinds of oversight systems to some extent? For example, as I was saying, how can we empower the information-security or biosecurity offices and create what, in the field of disarmament where I have worked, is called the web of prevention. One measure is not enough. It’s not sufficient.
You need to have a number of measures in place which collectively can help prevent something bad from happening. Thank you.
Thank you. That’s very insightful. And I think we’ve already touched on some areas that, you know, would be follow-up questions. P.T., focusing a bit more on open science in high-risk domains, especially in biological data and AI capabilities, as Surya was mentioning: how do we preserve the benefits of open science while preventing the destabilizing diffusion of capabilities that we were just discussing?
Thank you. Thank you for having me today. So I guess I would love to be able to give a binary yes or no answer, right? I think we all want to have that. But unfortunately, that’s not quite the case. So we need to find a way to balance the openness and also the restrictions as well. So I guess my answer here would be sort of tiered access and contextual norms. I think those are really important. And I think RAND Europe has done a really great job at establishing the global risk index on AI-enabled biological tools, and also just generally looking into AI safety in general, where they do this thing that they call the pre-deployment assessment.
With structured rubrics. And I’m a huge fan of that, because I think that when you release very frontier models and frontier tools, the danger is already out there once released. It’s really hard to withdraw the danger. But there’s prevention, right? There’s this window before you release where you can do a pre-deployment assessment. So I’m a really huge fan of that, and in the same way I’m a big fan of KYC, know your customer. And this principle also pretty much applies in the case of biosecurity, where we differentially allow the development of medical countermeasures and also the defensive measures that are necessary for the research, but also don’t limit the researchers from actually innovating either.
And I guess my point here is that we’re not going to be able to do that otherwise. The non-safeguarded access, like private access for credentialed researchers where necessary for defensive research, is absolutely necessary. And then, you know, open-source tools are necessary. We can’t turn away from being open source. Any governance structure that conflates open source with danger makes a huge mistake, because that is also a very critical development point, especially for lower-resource settings. So we cannot afford to conflate that altogether. So, a very long way to answer this, and to summarize my answer: differentiated governance at the capability level is always better than blanket restriction at the access level.
Yeah.
I think that’s a very structured answer, and I think, you know, there’s the start of a very valid framework-level conversation already happening there. Geeta, turning to you, thinking more about institutional gaps in enabling some of the potential solutions that we are discussing: what are the most immediate gaps that you see in evaluating systems, across technical capability, regulation and coordination, largely from the policy angle that you work in?
Thank you, Shyam. Good morning, everyone. So on the technical capabilities, right, the most fundamental thing I see is the AI-readiness aspect of deployment. So in general, India stands or ranks third globally, and when we see the Southeast Asian countries, I think Indonesia is around 49, and so there we see the gap, right? So whatever we do from the Western context or in the Indian context can never be catered to the AI-readiness aspect of deployment. So I think it’s important to cater to the unique needs of the Southeast Asian countries. And moreover, there is the end-user perception, where we see that we have to build a lot of capacity for creating awareness among the end users who are actually going to use the products. And from the policy perspective, I would like to give you certain aspects where we think about the socio-cultural aspects that are relevant to the deployment environments.
So in general, the large language models are usually trained on Western data, and there is some very recent research work; maybe I will cover a bit of both tech and policy here. So there is a Southeast Asia-related safety benchmark which says that all these leading large language models, when evaluated, failed on more than 20 to 30 percent of the risks in biological settings. Which means that we did not have enough safeguards to protect people from encountering all these risks. And moreover, this lets us know that we have to build in more sociocultural evaluations and assessments which will cater to the harms particular to that deployment environment, rather than just having high-level evaluation strategies.
And this cannot come just from the policy side, right? So we need to bring in the participatory approach, which will bring in the end users and the different stakeholders involved in using all these AI systems, right from the requirements definition. So when we assess whether we need an AI system or not, generally now there is a perception that for whatever we are going to build, or the problem we are going to solve, by default we assume that we need a large language model, which is not even possible to have deployed in a low-resource setting, right? So we need to think about small language models which will enable edge deployments in the low-resource settings, and also consider all the multicultural and socio-economic diversity that exists in these regions, so that your model doesn’t hallucinate and is still fair, and also establish some governance and accountability frameworks which will make the developers more accountable, because having the developers more accountable will enable them to consider more safeguards, right?
And also create more awareness: the main fundamental thing is that they will be expected to document whatever testing has been gone through. And on the policy side, there is one more aspect which the Indian government also endorses, right? Self-regulation: voluntary commitments on managing and mitigating the risks that come out of all these AI models. So I think we have to have a unified framework which can still be adaptable to different deployment settings.
I think we are already getting a diversity of perspectives here, and it is very useful to hear. Moving ahead and thinking about institutionalizing these kinds of capabilities in scientific AI contexts, P.T., turning to you: should independent evaluation and red teaming of AI systems that generate biological outputs, as a technical kind of solution to this problem, and especially thinking about biosecurity given your perspective on this, become a norm and part of the global scientific infrastructure? And if so, how would we go about that?
I think we have to have a clear understanding of the role of the AI system and how it can be used, and I think that is a key point. So I guess a good example to use here is probably nuclear weapons, right? Which fall under this organization called the International Atomic Energy Agency, the IAEA. Now, from my perspective, I think fissile materials, correct me if I’m wrong, are very scarce.
And they are, to a certain degree, technically trackable. And they are also, more than anything else, highly regulated. Whereas biology, on the other hand, is everything but that. It’s diffused, it’s dual-use by nature, and it’s also nearly impossible to trace. And also, most importantly, commercially available, right? And so there was a recent study, actually, done by this organization called SecureBio, where they tested frontier large language models against expert virologists. And it turns out that ChatGPT o3 actually outperformed expert virologists by 94% at troubleshooting wet-lab protocols. So that’s a very shocking number, right? And then, I mean, obviously you mentioned earlier that there’s a very concentrated effort happening between the US, UK, and China, the global superpowers, basically.
And in the recommendation from RAND Europe that I was, you know, helping out with, we recommended that governments and also independent researchers do this six-monthly ritual of monitoring and assessment of risk on a continuous basis. And we also suggested, obviously, using AI as an automation tool to increase the efficiency of this risk-monitoring system. But to your point, stuff like that, non-interactive methodology that doesn’t require researchers to actually query the dangerous systems directly, is already in and of itself a very meaningful safeguard. But that is not enough. You know, we need something that is much larger than that.
That is the integration, like, you know, institutionalizing it. And I would argue that a six-monthly ritual, that refresh cadence, for it to be delivered, is going to require a very significant investment from governments at the multilateral level, right? And so we can’t go without any investment at all. So my suggestion would be to actually implement this AI safety or security institute model that we’ve been applying, where largely it is technically credentialed, it’s independent, but it also has a very formal relationship with the government. And something that I would caveat from the bio side is that the institution should have some kind of anchoring around the Biological Weapons Convention or the WHO.
Because right now that relationship is not quite there yet. And I think, you know, back to my point about pre-deployment assessment, I think that is definitely needed, and then the result has to be shared across the credentialed network with tiered confidentiality, rather than being kept, you know, as proprietary to the different states. I think it’s kind of a…
That’s an interesting position, PT. Suryesh, thinking more about safety measures at large, how can we make sure that they remain rigorous and feasible within research ecosystems that you’re quite familiar with, you know, from a biosecurity angle, if you will, but largely also in the larger scientific ecosystem.
Thanks, Shyam. I think the first thing that we need to understand is how that ecosystem is, and then see if certain measures will work there or not, right? One of the hallmarks of, let’s say, the Indian scientific ecosystem is that there is a lot of heterogeneity. There are some places which are really extremely well performing, and there are other places which are not well resourced or have all kinds of other challenges. So understanding how the ecosystem is, what kind of regulation within the institutes is there, what kind of administrative measures are there, what kind of safety teams these kinds of institutes might have, all of those things are extremely important, right? The governance capacity, compliance culture and technical expertise vary widely across Indian institutions.
And I believe this is true for many other countries in the global south as well. So it’s not something very unique. Particularly to India, we have challenges related to different kinds of resources. And even when the resources are there, sometimes it’s also problematic to use them efficiently enough. Now, given that context, if we just import safety frameworks that are developed in a well-resourced place in a Western country or any developed country, I don’t know if those would be a very good fit for the kind of system that we have here. So those might become more performative than functional to some extent. Another challenge that P.T. also mentioned to some extent is that the speed and scale of AI is huge, right?
And the traditional review mechanisms that institutes have for safety audits and all of those things are not going to work. We need something which is far more adaptive and quick. And also, what we had traditionally were these periodic, paper-based, facility-centric kinds of measures. And those are very much outdated in the era of AI that we live in. Now the question becomes: how do we design proportionate, capability-aware safeguards that would be better matched to the challenges that we have? One of the major challenges, as I think a lot of us realize, is that there is limited awareness about AI safety when it comes to scientific issues, even among the scientists.
So a large majority of scientists just don’t know that what they are putting into, let’s say, ChatGPT might be harmful, or that what they are getting out of biodesign tools could be harmful to some extent. So there is some understanding about the privacy-related issues, but safety and security is still a big gap in the understanding of even the scientific experts that are there. Now, also regarding AI, I think there needs to be a tiered risk classification. So not everything is highly risky. There are certain biodesign tools, for example, that are trained on virus data. Those we would put in a higher risk category compared to something which is just working, let’s say, on certain animals which are not dangerous.
Now, also the safety measures, as I was mentioning earlier: as the risk has moved a bit upstream, it has come more on the design side, so we should also have more safety measures moving upstream. And as P.T. was mentioning, certain kinds of evaluation before launching AI tools are necessary, but also integrating AI-evaluation modules into grant-review processes, creating cross-trained AI-biosafety review panels, so panels specifically for AI biosafety, from the bottom-up side instead of having them from the top-down approach. Investing more in domestic evaluation capacity, having more AI-safety institutes like Geeta’s home institute at IIT Madras. So we need a lot more of that. And lastly, a lot of AI-safety work is being done in the US and UK, right?
And as I was mentioning, importing that directly might not work. And we in the global south are largely the users and importers of this technology. So we have to see, from the bottom-up side, where we put those safety measures. When it comes to import, when the data is being transferred, are there certain places where we can put those kinds of safeguards? Also, how can we use some tech-sovereignty measures in this context, right? Tech-sovereignty measures are used for a number of things, but AI safety and security is something where those could also be used to some extent. So, yeah, I would stop here and then we can discuss.
Thank you.
Thank you. And I think a lot of useful thoughts here for us to explore a bit more. I think we’ve just crossed the mid-mark, and I’m going to use Geeta to kind of bridge between the next two topics by combining two of your questions, sorry for that. So, just as Surya mentioned: will the emerging scientific powers, you know, Global South middle powers, be able to shape governance in this context, especially in enabling science, or will they continue to inherit the frameworks? And if they were to show leadership, what would that look like in scientific AI and research ecosystems? And, you know, you’ve already been working on some of this, so I’m looking forward to hearing concrete measures that are happening.
Sure. So in general, what I think is that the emerging powers are definitely putting in all efforts to bring in the tools and frameworks that are required for governing these AI systems. For example, India’s strategy towards all these emerging techs is to create sandboxes, which are highly essential for deploying or evaluating safety aspects of the models, right? So they do it for healthcare systems, they do it for ideology systems, and whatever, right? So these types of tools and frameworks that come from Indian settings will actually help the other underdeveloped countries to learn from the strategies that we use and then build something of their own, or something which cannot go cross-border can still happen through learning and collaboration, right?
So, for example, we are going to launch a Global South network for trustworthy AI, which will enable all these mechanisms to happen, enable people to develop and deploy AI systems which will be deployed in the low-resource settings. And the other initiative which is going to give a very big leap in evaluating AI safety is coming up with an AI-safety commons for the global south. That is part of the safe and trusted AI pillar, one of the pillars in this impact summit, and I think in another one or two years we will have the safety commons, which will help us evaluate and assess how these AI data, models and systems work for different deployment settings.
Another important thing is, as Suresh mentioned, about the audit frameworks. So when we focus on the kind of risk and audit mechanisms that we have here, we still have them from an organization perspective and not from the end-user perspective. So at CeRAI, we have come up with an incident-reporting mechanism and a framework that caters to the Indian settings. So it tells you how to operationalize AI incident reporting in the Indian settings, which is completely different from the Western settings. And here we have to get the harms that people experience in the marginalized communities, which would otherwise never be recorded anywhere, right? So how do we enable all these things?
So since it is all about all these AI-based systems, right, even those things will have certain impacts on the marginalized communities, which may be an indirect impact. But how do they know that such things are happening to them, right? So those kinds of gaps we should mitigate by building more awareness, creating more AI literacy. And we should also be able to provide more privacy to all these people. The final thought, combining all these things, is that we have to bring in some kind of collaborative work between the different stakeholders who are involved in developing and deploying these systems. And the governments have already given certain prompt knowledge about how to enable all these things, through the techno-legal framework and the AI-governance guidelines recently published by MeitY. So the Southeast Asian countries can learn from developing countries like India and then curate a more tailored approach towards their unique needs. So that is what I think. So whoever has an opportunity or a willingness to leverage these technologies can learn from the mistakes as well as the experience that the other countries have, which is now openly available through all these summits.
That’s very useful, and I’m looking forward to following up on IIT Madras’s work on this front as well. Going to Suresh for kind of the last question in this series, really: where should safety measures and evaluations primarily focus? Should the focus be at the model level? And you talked about upstream quite a bit. Should there be broader socio-technical readiness measures, misuse considerations? Where do you think it should be?
And also, very importantly, we have to see it from the context of, you know, people doing their own thing, the DIY kind of science that happens, and also small-scale commercial activities which are not fully under the oversight mechanism of the government, right? So, considering all of these points, the policy evaluation must expand from model-centric assessment to socio-technical assessment. And this would include evaluating things like how much capability uplift there is relative to the government capacity. So, government has a certain capacity to manage or do oversight, but how are these AI tools changing that? Incentive structures, very, very important, that shape model deployment. Also, the diffusion of risk across borders.
All of these things don’t respect national borders, right? So, how is it going to spread, with people using VPNs or a number of other things that are there? So lastly, the integration with existing biosafety and research-security systems, as I had already mentioned. So briefly: performance evaluation is necessary, but governance-relevant evaluation must be systemic. Otherwise, we risk auditing algorithms while ignoring the institutions that operationalize them. And that is very, very important, how we focus on those institutional-level mechanisms. Thank you.
P.T., kind of the last structured question before we move into a bit more of an open conversation. AI becomes embedded not just in new capacities, but also in existing programs like biosurveillance and public-health systems. And so there’s a mix of emerging scientific knowledge with more legacy, let’s call it engineering knowledge, as well. So how do we make sure that safety, evaluation, interoperability, all of that exists across this divide without fragmentation happening across the ecosystem? Because, you know, you can easily imagine everyone doing their own AI safety evaluation and not necessarily talking to each other.
Thank you, Shyam. I think this is a very important question. And it’s also a topic that I’m really passionate about as well, which is biosurveillance. To your point, countries are already deploying AI-enabled biosurveillance systems, whether that’s syndromic surveillance, genomic sequencing pipelines or outbreak modeling. The countries are already doing that, but they are not building on unified data standards. So they’re basically building on very incompatible data standards with very different legal regimes across the borders. We’ve seen that in Southeast Asia. We’ve seen that even between countries like, for example, Singapore and Malaysia; you see different legal regimes on how they monitor the data and also the biosurveillance.
And so the fragmentation risk is actually not just a technical risk, I would argue, because we’ve seen COVID. I think we all were a little bit traumatized by COVID. We’ve seen how data hoarding and incompatible reporting actually cost lives. And I saw that especially happening across the region in the lower-resource settings, countries like Cambodia, for example. AI systems that are trained on non-representative data obviously are going to perform much worse. And guess what happens? When they perform worse, the region that is most affected is the region that needs the help the most. And that region is also the same region with the least data infrastructure.
And so, to answer your question about what I think we need to do, there are three things to be addressed here. The first one is obviously data-standards harmonization. Currently, we don’t have that. I think we would need not a global overhead standard that is enforced on every country, but more of a federated interoperability framework that applies to different countries. So I can think of something like HL7 FHIR, the Fast Healthcare Interoperability Resources, which attempts to address these very specific issues on clinical data, but this one would be adapted for public-health surveillance. And the second point is legal safe harbors, basically for cross-border sharing of data during public-health emergencies, that are negotiated beforehand. And this is important: beforehand, because if you negotiate during an outbreak, people are going to be freaking out.
People are going to be like, I’m not going to share my data with you. What are you going to do with that data? So this needs to be done beforehand. And the last point, and also the most politically challenging point, is actually to have some kind of shared evaluation criteria across the board between different countries that are embedded into the national surveillance systems. For example, Singapore’s data infrastructure environment might not apply to countries with different climate data or different demographic data. So this needs to be applied within, you know, the national surveillance systems. And I guess the last message is that, what I noticed, the AI-governance frameworks often think of biosurveillance as, like, a niche edge case.
And then people doing biosecurity frameworks think that AI governance is, like, a tool. And these people don’t talk to each other. And that gap, that gap right there, is where the risk happens. So, yeah, we just need to talk to each other more. That’s easier said than done. Yeah.
So I think I’m just about to close, with maybe five minutes or just under that for audience questions. Thank you, Justin. Ten-second final thoughts from each of you on the panel. Suresh.
Just very quickly: we need to also keep in mind how AI could help solve some of these AI-safety challenges, how agentic AI could be used, let’s say, when people are trying to develop vaccines. CEPI has developed this platform where agentic AI is being used to check if there is someone who is trying to jailbreak or misuse the tool. Second very quick point: with all that I said, there is still a gap in transferring things from digital to physical, what is called the digital-to-physical barrier. So even if you have everything, you still can’t just develop or modify viruses without having a proper physical infrastructure, and there are still some ways to control that.
Thank you.
I think we should move on to transforming from issues to intelligence: learning from the risks that happen and feeding that back into the model training and other assessment activities, to mitigate the risk in real time. That is where we need to move towards, bringing in more people into evaluations and then making it safer for people to use.
I guess I’ll make it quick. The point that I want to make here is to echo Surya’s point: I think you’re right that we should not shoot ourselves in the foot, especially for developing countries; I think that’s really important. And so my last message here is just that, while we are forging ahead in innovation, in whatever scientific domains we’re working in, we need to be conscious of the impact that we have. And I think the AI Impact Summit is one of the really good places to jumpstart those kinds of conversations and break the silos. Thank you.
Thank you, everyone. I’m just going to take probably one minute to summarize key points. Evaluation, I see, is largely a systemic question; safety measures, a systemic question. I especially like the point on incident response not already being there. And a couple of points on the cross-border solutions and problems, we already have that. On the discussion of open science, we talked about managed access, safeguards, and comparing government capacity to manage that versus letting it out for more DIY-oriented science, which is a good term, I really like that. That’s a key area. And for emerging scientific powers, of course, collaboration is key. A tailored approach, that’s something that I’m again waiting to see from IIT Madras as well, their contribution on this.
And some cross-border work on legal safe harbors and data-standard harmonization, P.T., that you mentioned, really landed well from this panel. I’m going to stop my summary right now, and more of this will be put together in a blog at some point in the near future. Perhaps we can go for questions first. Yes, please, I think I can give you mine.
Thank you so much for your wonderful insights. I really enjoyed this session. As a researcher in safety of AI at the University of York, I focus on psychological harms of AI. And so what I want to ask, particularly Geeta, is: when it comes to the definition of harms, traditional safety engineering caters more to physical harms, and now we see the whole spectrum of harms expanding beyond that. So I would love to know the work being done by CeRAI and you in this area, and, in fact, to enrich my research with it.
Yeah, sure. So when we actually assess harms and impacts, right, we have to do it from two different perspectives. One is on the functional side, where we assess all these algorithmic risks and other stuff. And then from the human-centric perspective, like you said, we can keep doing everything from the psychology perspective and other ethics and other stuff. So here at CeRAI, we do work on assessing bias, determining whether the model is stereotypical or not, and how we generate explanations for the high-level scientific models and all. So from the perspective of the psychological things, there are the cognitive-science or cognitive capabilities of AI models, which will actually enhance or degrade the capabilities of humans.
So those things, we are trying to do some assessments on from the incident perspective. So if you go and read the incident-reporting framework that we have, we have a taxonomy of risks and harms and also the impacts. So among the kinds of harms that we have defined, we have categorized them as physical, psychological, and cyber incident-based harm sets. And moreover, we have all the generic kinds of harms, like algorithmic harms, socio-economic harms, the environmental harms and all. So we are trying to come up with a taxonomy that will cater to the different hierarchies that will be applied to these kinds of harms and impacts, which will again be model-specific, use-case-specific and domain-specific.
So that is where we are trying to work. And we also have a healthcare-based toolkit, which will enable people to actually assess their perceptions of how they treat these models, how they see whether these AI applications are helpful for them or not, and then come up with some capacity-building programs for the different roles in which they are working. And this has been done with CMC Vellore Hospital, where we have been assessing the perceptions of healthcare workers, and then coming up with a training module which will enable them to use AI models or tools more confidently rather than, say, being resistant or not relying on them so much.
Last, probably last quick question. Maybe keep it short on the responses as well, please. Sorry.
Hi. So my question is about, like, we are discussing all the geographical barriers, right? The modality is geography. When we change the geography, the models tend to perform poorly. Are we concerned about the temporal modality as well? When we go forward in time, the data is going to change eventually, and that is going to affect modeling. And how do we plan on, like, you know, mitigating such a problem if it arises?
Yeah. So this comes under the model monitoring, the system monitoring approach, where we consider the data drifts out of distribution. So we consider the distribution aspects of the data and models. So definitely this is one of the criteria where you assess safety and evaluate the impacts of it.
Yes, I think last question
Thank you so much for the insightful discussion; I really appreciated the expertise that you’re bringing to the topic, and thanks, P.T., for bringing up COVID, because my question is about that. As we learned from COVID, biosecurity risk can quickly become a cross-border existential threat. So what would a successful web of prevention and incident-response framework look like, and who are you looking up to in this space? Like, who’s doing it well in this space?
I can start, and maybe P.T. can add. So I think, as I was mentioning, it will have to be more decentralized but at the same time integrated with the leadership. So there needs to be more empowerment of people who are, like, biosafety officers in the lab, or who are institutional biosafety committee members, or people who are working on the ethics and research-security side at the institutes. Those are the people who need to be empowered. So there needs to be more capacity building of those people, and at the same time there needs to be a mechanism established so they can report those incidents to the very top, where there is top leadership sitting in the capitals.
They can in some way get an overview or monitor the situation as it is going on at the different institutes’ level.
Thanks. I can add a little bit to that. So in Singapore we actually have different agencies responsible for this. We have the National Environment Agency, and then we have the MOH, obviously the Ministry of Health, and then we also have different smaller agencies like the Communicable Diseases Agency and also the PREPARE agency, where they are responsible for different tasks. But I want you to envision this as almost like the way that Singapore is trying to establish itself, almost as a firefighter. So when there’s an incident, where there’s a crisis, who is actually doing what is very clear, but it’s not always clear across different countries. For example, in Laos or Vietnam it might look very different, but I think what matters is having a very coordinated response across the different agencies on who is doing what.
Like, for example, the National Environment Agency is responsible for wastewater surveillance, so monitoring whether the sickness is increasing or spiking or not; those are the people, yeah, that you would look up to. And I think that’s the last word, right? It all comes down to prevention and preparedness, in this biocontext much like anything else.
Thank you, everyone, for the questions, and thank you to my brilliant panelists, Suryesh, Geeta, and P.T. This was a very insightful discussion. On the screen is the work from RAND Europe with CLTR, some of what was referred to by P.T. and other panelists as well, some aspects of what we were discussing about risk typification. You’ll probably get some ideas there as well. And with that, I close. I’m surprised, I’m supposed to hand over these mementos, apparently including to me, so let us do that now. Thank you.
Event“The moderator began by asking whether the challenges of AI‑enabled biodesign should be framed primarily as a data‑governance problem, a model‑design issue, or a verification‑and‑compliance matter.”
The knowledge base records that the moderator provided the opening framing for the discussion (Keynote-Rishad Premji) [S60].
“Risk governance has traditionally relied on physical controls such as lab inspections and material‑transfer agreements, but AI‑enabled biodesign tools are moving risk upstream to the design phase.”
The knowledge base notes that AI has fundamentally altered where risks originate in biological research, shifting attention from traditional physical controls to upstream, design-phase considerations [S21].
“Data governance, model evaluation and red‑team exercises remain essential for managing AI‑enabled biodesign risks.”
Red-teaming is highlighted as a critical, human-intensive process for identifying system gaps and scaling evaluation methods, underscoring its essential role [S56].
“Model cards, evaluation benchmarks and feedback loops are used to flag potential risks and improve AI models.”
The knowledge base describes the practice of publishing model cards and evaluation benchmarks to provide transparency and create feedback loops that can surface risks [S35].
“AI technology could facilitate the development of chemical or biological weapons, creating new security challenges.”
UK Prime Minister Rishi Sunak’s remarks emphasize that AI may enable the creation of chemical or biological weapons, supporting the claim that AI-enabled biodesign introduces novel security concerns [S66].
There is strong convergence among the panelists on the need for decentralized, capability‑based governance, early pre‑deployment safety checks, capacity building in the Global South, harmonised data standards with legal safe‑harbours, and robust multi‑agency incident‑response frameworks. These shared positions span technical, policy, and institutional dimensions.
High consensus – the speakers largely reinforce each other’s proposals, indicating a collective readiness to pursue coordinated, context‑sensitive governance mechanisms. This consensus suggests that future policy work can build on these common foundations rather than reconciling divergent views.
The panel shows substantial disagreement on governance architecture (centralized institute vs. decentralized local checks), on the degree of access restriction for high‑risk AI tools (open‑source freedom vs. tiered access), and on the primary timing of safety assessments (pre‑deployment embedding vs. periodic post‑deployment monitoring). There is also a conceptual split between technical risk classification and socio‑cultural benchmarking. While participants converge on the need for oversight, incident reporting, and AI‑driven monitoring, they diverge on implementation pathways.
Moderate to high disagreement. The divergent views on centralisation, access control, and assessment timing could lead to fragmented policy approaches if not reconciled, potentially weakening collective biosecurity safeguards and slowing coordinated action across the Global South.
The discussion was driven forward by a series of pivot points that moved the conversation from abstract risk identification to concrete governance architectures, regional capacity considerations, and technical-legal solutions. Speaker 1's framing of an upstream, design-centred risk landscape forced participants to rethink traditional biosafety models. P.T.'s tiered-access and AI-safety-institute proposals supplied actionable policy scaffolding, while Geeta's emphasis on AI readiness and socio-cultural fit highlighted the inequities that any global framework must address. The later focus on data-standard harmonisation and legal safe-harbours broadened the scope to system-level coordination, linking AI safety to public-health infrastructure. Collectively, these comments reshaped the dialogue into a multi-layered, globally inclusive roadmap rather than a single-track regulatory narrative.
Disclaimer: This is not an official session record. DiploAI generates these resources from audiovisual recordings, and they are presented as-is, including potential errors. Due to logistical challenges, such as discrepancies in audio/video or transcripts, names may be misspelled. We strive for accuracy to the best of our ability.