Digital Democracy Leveraging the Bhashini Stack in the Parliament

20 Feb 2026 12:00h - 13:00h


Session at a glance

Summary, keypoints, and speakers overview

Summary

The session focused on building an inclusive, open-source voice AI ecosystem for India, emphasizing the need to continuously adapt technologies to diverse languages, cultures and users [1-13]. Amitabh Nag highlighted that AI solutions have a short “shelf life” and, unlike static machines, carry no warranty, so they must be regularly upgraded, especially given the vast linguistic and cultural diversity across the region [5-8][9-13].


Ariane Hildebrandt introduced the newly released Policy Report and Developers Toolkit, describing them as a joint German-Indian effort that provides best-practice guidance and embodies a shared vision of digital inclusion through voice technology [24-38][42-44]. She stressed that voice interfaces are crucial for low-literacy populations and that responsible, multilingual voice AI can unlock access to public services, aligning with the Hamburg Declaration on Responsible AI for Sustainable Development Goals [36-41][49-52].


Harleen Kaur outlined a four-pillar policy framework: treating foundational data as public goods, institutionalising sustainable open-source infrastructure, building open and representative models, and strengthening responsible deployment [73-78]. The accompanying developer toolkit translates these principles into practice by focusing on representation planning, data-quality assurance, and embedding responsible AI throughout the development lifecycle [90-94][97-101].


In the panel, Nag described two main pathways for sustaining data creation: large-scale “brute” collection of diverse speech samples and the generation of improvement corpora from deployed products, including both open-domain and closed-domain sources [121-138]. Ghosh argued for a smarter, cost-effective approach that leverages intrinsic linguistic components rather than exhaustive data gathering, illustrating this with a Telugu project that covered four dialects by identifying common acoustic features and supplementing them with targeted data [154-168][174-184]. Kritika K.R. emphasized that industry adoption requires scalable, edge-ready infrastructure, domain-specific model fine-tuning, and compliance safeguards to ensure reliable deployment across sectors such as healthcare and manufacturing [190-199]. Vallianeth highlighted the intersecting legal challenges of privacy and copyright, urging robust documentation, privacy-enhancing techniques, and clear licensing from the outset to build a trusted ecosystem [205-224]. Ghosh warned that human transcription variability makes traditional word-error-rate metrics insufficient, proposing multi-layered, subjective-objective evaluation methods and downstream feedback loops [228-241]. Nag reinforced that ultimate acceptance of voice systems rests on audience perception rather than absolute rankings, suggesting that standards should be shaped by what end-users deem understandable and trustworthy [256-272].


The participants agreed on the need for a unified, nationally coordinated evaluation framework, potentially a single leaderboard, to drive continuous improvement while fostering collaborative competition [315-321]. The discussion concluded that aligning policy, technical, legal and evaluation efforts is essential to realize inclusive, responsible voice AI that serves India’s diverse population [24-38][73-78][205-224].


Keypoints


Major discussion points


Dynamic, user-driven data ecosystems are essential for sustainable voice AI.


Amitabh Nag stresses that foundational speech datasets must be continuously created, enriched through user feedback, and treated as digital public goods to keep models improving over time [121-138]. Nihar Desai later summarizes this as “data sets need to be more of lived-in nature… built upon by users” [146-148].


Inclusive language coverage requires smart, cost-effective collection strategies rather than brute-force data gathering.


Prasanta Ghosh explains that Indian linguistic diversity can be addressed by focusing on intrinsic language families (Indo-Aryan, Dravidian) and balancing data volume with coverage [155-168]. He illustrates the approach with the Telugu dialect project, showing how a “region-anchored” method reduces time and budget while preserving diversity [174-183].


A four-pillar policy framework and a developer toolkit translate inclusive AI principles into practice.


Harleen Kaur outlines the policy pillars: treating foundational data as public goods, institutionalising sustainable open-source infrastructure, building open and representative models, and strengthening responsible deployment [73-78]. The accompanying toolkit operationalises these pillars through guidance on representation, data quality, and embedding responsible AI (RAI) throughout the development lifecycle [90-108].


Legal and governance safeguards (copyright, privacy, documentation) are critical to protect trust in the ecosystem.


Thomas Vallianeth highlights the intersecting challenges of copyright and privacy, urging early-stage provenance checks, privacy-enhancing techniques, and robust documentation to enable safe downstream use [208-218][221-224]. He later notes that while the law can accommodate some subjectivity, clear evidence and trust-building measures are needed [286-298].


Evaluation of voice models must move beyond single-metric, objective scores to a multi-layered, ecosystem-wide approach.


Ghosh points out the variability in human transcription and argues that word-error-rate alone is insufficient; instead, multi-output models, subjective human review, and downstream-application feedback should be incorporated [228-240]. Nag adds that ultimate acceptance hinges on audience perception rather than absolute rankings [256-273], and participants call for a national, collaborative benchmarking framework [315-319].


Overall purpose / goal


The session launched the Policy Report and Developers Toolkit “Building on Open and Responsible Voice Technology Ecosystem in India” and served to (1) showcase the Indo-German partnership that produced the report, (2) present a concrete policy framework and practical toolkit for inclusive voice AI, and (3) mobilise stakeholders (government, academia, industry, and civil society) to adopt open, responsible, and culturally diverse voice technologies that advance public services and sustainable development.


Overall tone and its evolution


– The discussion begins with a formal and optimistic tone, celebrating collaboration and the report’s release [24-34].


– It then shifts to a technical and problem-solving tone as participants detail challenges in data collection, linguistic diversity, and legal compliance [65-84][208-218].


– Mid-conversation the tone becomes reflective and candid, acknowledging the inherent uncertainties, subjectivity, and “no-warranty” nature of AI systems [8-15][256-273].


– The closing remarks adopt a constructive and forward-looking tone, urging continued workshops, benchmarking, and ecosystem-wide trust mechanisms [302-319][322-327].


Overall, the dialogue remains collaborative and solution-oriented, moving from celebration to deep analysis and finally to actionable next steps.


Speakers

Ariane Hildebrandt – Dr.; Director General, Department for Global Health, Equality of Opportunity, Digital Technologies and Food Security, German Federal Ministry for Economic Cooperation and Development (BMZ); expertise in global health policy, digital technologies, and food security. [S2]


Nihar Desai – Head of JNI; Moderator of the panel discussion; expertise in moderation and digital initiatives. [S3]


Moderator – Session moderator (unnamed); role: moderating the event.


Kritika K.R. – Head of Artificial Intelligence and Product Research, SanLogic; expertise in applied AI and product research. [S8]


Prasanta Ghosh – Dr.; Associate Professor, Indian Institute of Science; expertise in speech technology research and academia. [S9]


Thomas J. Vallianeth – Counsel, Trilegal; expertise in legal aspects of AI, copyright, and data governance. [S11]


Harleen Kaur – Research Manager, Digital Futures Lab; expertise in policy research and developer-toolkit development. [S12]


Amitabh Nag – CEO, Digital India Bhashini Division (DIBD), also referenced as CEO of Bhashini; expertise in AI ecosystem building and voice technology. [S13]


Additional speakers:


Shailendra Pal Singh – Senior General Manager, Bhashini; felicitated the speakers at the event.


Full session report

Comprehensive analysis and detailed insights

Opening Remarks – Amitabh Nag


Nag opened by stressing that any AI-driven voice solution must be scalable across regions such as Southeast Asia and Africa and continually refreshed, noting that a model’s “shelf-life” can be as short as three to six months [1-5]. He contrasted AI systems with static machines, pointing out that there is no warranty or guarantee for AI models and that diversity of people, languages and cultures makes inclusion a core design requirement rather than an afterthought [6-13]. Nag concluded that progress will be incremental, moving step-by-step toward higher levels of inclusion [17-19].


Keynote – Ariane Hildebrandt (Director General of the Department for Global Health, Equality of Opportunity, Digital Technologies and Food Security, German Federal Ministry for Economic Cooperation and Development) [24-26]


Hildebrandt launched the Policy Report and Developers Toolkit “Building on Open and Responsible Voice Technology Ecosystem in India.” She thanked Digital Futures Lab, ARTPARK, Trilegal, and NASSCOM as key partners [33-36]. The report, a product of a German-Indian partnership, offers best-practice guidance and hands-on advice for policymakers and the tech community [30-32]. Hildebrandt framed voice AI as a gateway for low-literacy populations to access public services, health care, education and economic participation, warning that failure to provide multilingual voice interfaces can reinforce exclusion [34-41]. She linked the initiative to the Hamburg Declaration on Responsible AI for Sustainable Development Goals, underscoring that AI should serve people and the planet [49-52].


Report & Toolkit Presentation – Harleen Kaur (Research Manager, Digital Futures Lab) [73-78]


Kaur outlined the four-pillar policy framework:


1. Treat foundational datasets as public goods;


2. Institutionalise sustainable open-source infrastructure;


3. Build open and representative models;


4. Strengthen responsible deployment.


She explained that treating data as a public good means government funding and convening for languages that are not commercially viable [79-81]; institutionalisation involves standardised documentation, collaborative data-steward models and shared national compute resources [82-85]; the third pillar calls for locally curated benchmarks and representative models [86-88]; and the fourth stresses public-value sharing, community buy-in and literacy to prevent misuse [85-88].


The accompanying developer toolkit translates these pillars into practice, focusing on representation planning, data-quality assurance, and embedding Responsible AI (RAI) throughout the development lifecycle [90-108]. Practical recommendations include maintaining a diversity wish-list, using synthetic data, adopting a layered data strategy (active, passive and synthetic sources), applying robust transcription standards, and implementing continuous post-deployment monitoring [97-111].
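To make the layered data strategy concrete, here is a minimal, hypothetical Python sketch of how a team might track actively collected, passively sourced, and synthetic speech against a diversity wish-list; the class and field names are illustrative and not part of the toolkit itself.

```python
from dataclasses import dataclass, field

@dataclass
class SpeechBatch:
    source_type: str   # "active" (field collection), "passive" (open domain), or "synthetic"
    language: str
    dialect: str
    hours: float

@dataclass
class DataPlan:
    wishlist: set                                # (language, dialect) pairs we want covered
    batches: list = field(default_factory=list)

    def add(self, batch: SpeechBatch) -> None:
        self.batches.append(batch)

    def coverage_gaps(self) -> set:
        covered = {(b.language, b.dialect) for b in self.batches}
        return self.wishlist - covered

plan = DataPlan(wishlist={("te", "krishna-guntur"), ("te", "vizag"), ("te", "chittoor")})
plan.add(SpeechBatch("active", "te", "vizag", hours=120.0))     # field collection
plan.add(SpeechBatch("passive", "te", "krishna-guntur", 80.0))  # open-domain audio
plan.add(SpeechBatch("synthetic", "te", "chittoor", 40.0))      # TTS-generated data

print(plan.coverage_gaps())  # empty set -> every wish-list entry covered by some layer
```

The point of the sketch is the process rather than the classes: coverage is checked against an explicit wish-list continuously, so representation gaps surface before training rather than after deployment.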


Panel Moderation – Nihar Desai (Head, JNI) [122-124]


Desai moderated the discussion and opened with the question: Should foundational datasets be treated as digital public goods, and how can a data-flywheel be created to sustain them?


Data-Creation Strategies – Amitabh Nag [124-148]


Nag described two complementary pathways:


* Traditional “brute-force” field collection that captures diverse speech samples across regions and dialects;


* Product-derived corpora generated automatically from models, including open-domain sources (e.g., YouTube) and closed-domain feedback loops from enterprise or government applications.


He argued that a flywheel of data generation and feedback is essential because datasets must be “lived-in” rather than static [146-148].
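A minimal sketch of that feedback loop might look as follows; the function names and the toy vetting rule are invented for illustration, and a real pipeline would route corrections to trained human reviewers before anything reaches the training corpus.

```python
improvement_corpus = []  # (audio_id, corrected_transcript) pairs fed back into training
vetting_queue = []

def log_feedback(audio_id: str, model_output: str, user_correction: str) -> None:
    """Capture a user correction from a deployed product (open or closed domain)."""
    if user_correction.strip() and user_correction != model_output:
        vetting_queue.append((audio_id, model_output, user_correction))

def vet(approve) -> None:
    """Move corrections that pass review into the improvement corpus."""
    while vetting_queue:
        audio_id, model_output, correction = vetting_queue.pop()
        if approve(model_output, correction):
            improvement_corpus.append((audio_id, correction))

# A user of a deployed app fixes a transcription; the correction is queued,
# vetted, and becomes part of the improvement corpus for the next training round.
log_feedback("utt-001", "mera nam amit", "mera naam Amit hai")
vet(approve=lambda old, new: old != new)  # stand-in for human review
print(improvement_corpus)  # [('utt-001', 'mera naam Amit hai')]
```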


Linguistically Informed Sampling – Prasanta Ghosh [155-184]


Ghosh proposed a cost-effective, language-family-first approach: start from the major families (Indo-Aryan and Dravidian), identify common acoustic components, and then target specific dialects. Using the ResPin Telugu project as an example, his team covered four dialects by first collecting data that captured shared acoustic features and then supplementing with targeted recordings, thereby reducing timeline and budget while preserving diversity [174-184]. This “region-anchored” strategy demonstrates how smart sampling can replace exhaustive data gathering [160-168].
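The cost logic behind this can be shown with a toy calculation; the hour figures and the assumed share of acoustics common across dialects are invented for illustration and are not numbers from the ResPin project.

```python
# Compare brute-force collection (a full budget per dialect) with the
# region-anchored idea: collect the shared acoustic core once, then only
# smaller dialect-specific supplements. All figures are hypothetical.

dialects = ["krishna-guntur", "vizag", "anantapur-chittoor", "nalgonda"]
FULL_BUDGET_PER_DIALECT = 1000  # hours, brute-force baseline
SHARED_FRACTION = 0.7           # assumed share of acoustics common to the family

brute_force_hours = FULL_BUDGET_PER_DIALECT * len(dialects)

shared_core = FULL_BUDGET_PER_DIALECT * SHARED_FRACTION        # collected once
supplement = FULL_BUDGET_PER_DIALECT * (1 - SHARED_FRACTION)   # per dialect

anchored_hours = shared_core + supplement * len(dialects)

print(f"brute force: {brute_force_hours} h, region-anchored: {anchored_hours:.0f} h")
# brute force: 4000 h, region-anchored: 1900 h -> similar nominal coverage at ~half the cost
```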


Industry Perspective – Kritika K.R. (Head of AI & Product Research, SanLogic) [190-214]


K.R. highlighted the need for scalable, edge-ready infrastructure and domain-specific model fine-tuning to enable reliable deployment in sectors such as healthcare, manufacturing and automotive. She stressed that model optimisation for device-level intelligence, combined with compliance safeguards, allows open-source models to be deployed on-premise, protecting sensitive data while supporting industry-specific vocabularies [200-207][208-214].
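One common pattern for the on-premise, edge-ready setup described here is to run a quantized open-source ASR model locally so that audio never leaves the device; the sketch below uses the open-source faster-whisper library, and the file name and domain prompt are illustrative assumptions.

```python
from faster_whisper import WhisperModel  # pip install faster-whisper

# int8 quantization shrinks the model enough to run on CPU-only edge hardware.
model = WhisperModel("small", device="cpu", compute_type="int8")

segments, info = model.transcribe(
    "ward_round.wav",  # processed on-premise; no cloud API call
    language="en",
    initial_prompt="haemoglobin, metformin, systolic",  # nudge toward domain vocabulary
)
for segment in segments:
    print(f"[{segment.start:.1f}s -> {segment.end:.1f}s] {segment.text}")
```

For deeper domain adaptation than a vocabulary prompt, the same open-weights model could be fine-tuned on the sector-specific corpora K.R. mentions, subject to the compliance safeguards discussed in the next section.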


Legal & Governance – Thomas J. Vallianeth [208-224][289-301]


Vallianeth outlined three legal dimensions:


1. Copyright provenance & licensing – even publicly available voice datasets may be subject to copyright and require provenance checks and appropriate licences;


2. Privacy-enhancing techniques at the point of collection to avoid storing personal data;


3. Robust early-stage documentation to provide downstream users with trust and evidentiary support in any legal dispute (a minimal sketch of such a provenance record follows this list).
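A hedged sketch of what such early-stage documentation might look like in code follows; the record fields and the licence whitelist are illustrative, not a legal standard, and real admissibility decisions would rest with counsel.

```python
from dataclasses import dataclass

PERMITTED_LICENCES = {"CC0-1.0", "CC-BY-4.0", "CC-BY-SA-4.0"}  # example whitelist

@dataclass(frozen=True)
class ProvenanceRecord:
    clip_id: str
    source_url: str
    licence: str            # SPDX-style identifier, or "unknown"
    consent_obtained: bool  # speaker consent captured at collection time
    pii_removed: bool       # privacy-enhancing step applied before storage

def admissible(rec: ProvenanceRecord) -> bool:
    """Gate: only documented, licensed, consented, PII-scrubbed clips enter the corpus."""
    return (rec.licence in PERMITTED_LICENCES
            and rec.consent_obtained
            and rec.pii_removed)

rec = ProvenanceRecord("clip-0042", "https://example.org/a.wav",
                       "CC-BY-4.0", consent_obtained=True, pii_removed=True)
print(admissible(rec))  # True -> usable downstream, with an evidentiary trail attached
```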


He warned that subjectivity in AI outputs will increasingly surface in courts, and that pre-emptive safeguards and transparent processes can mitigate such flashpoints [289-301].


Evaluation Debate


* Ghosh noted that human transcribers rarely agree word-for-word, making word-error-rate (WER) insufficient; he advocated for multi-layered evaluation that includes multiple hypothesis outputs, subjective human review and downstream task performance [228-244] (a worked example follows this list).


* Nag complemented this by asserting that acceptability is determined by whether the end-user understands the output, and that different contexts (e.g., courts versus casual conversation) demand different levels of linguistic purity [256-279].
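The WER limitation can be shown with a small worked example: the same system output scored against two equally plausible human transcriptions of one clip yields very different error rates. The sentences and the implementation below are illustrative.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate via Levenshtein distance over word sequences."""
    r, h = reference.split(), hypothesis.split()
    # d[i][j] = edits to turn the first i reference words into the first j hypothesis words
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(r)][len(h)] / len(r)

hypothesis = "please send the report by monday"
ref_a = "please send the report by monday"   # transcriber A
ref_b = "please send that report on monday"  # transcriber B, same audio

print(wer(ref_a, hypothesis))  # 0.0   -> looks perfect
print(wer(ref_b, hypothesis))  # ~0.33 -> looks poor, though the output is arguably fine
```

The same hypothesis is simultaneously “perfect” and “one-third wrong” depending on which valid reference is chosen, which is exactly why the panel argues for multi-reference, subjective and downstream evaluation layers.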


The panel reached consensus on the need for a national, collaborative benchmarking system (a single leaderboard under Bhashini) to drive competitive yet cooperative progress across languages and dialects [313-321].


Broad Consensus


Participants agreed that:


(i) Voice technology and speech datasets should be treated as public goods;


(ii) Continuous, feedback-driven data enrichment is essential;


(iii) Open-source governance and sustainable infrastructure must be institutionalised;


(iv) Evaluation must move beyond single-metric scores to multi-dimensional, context-aware frameworks; and


(v) Legal safeguards, documentation and privacy-by-design are prerequisites for trust [1-3][19][73-78][90-108][208-218][256-270][313-321].


Actionable Take-aways


Adopt the four-pillar framework and publish the developer toolkit to embed RAI practices.


Establish a continuous data-flywheel that combines field collection, product-derived improvement corpora, and a layered data strategy.


Convene regular workshops to co-design a national, multi-layered evaluation framework and an annual leaderboard under Bhashini.


Implement early-stage documentation, licensing checks, and privacy-by-design measures to satisfy legal requirements.


Encourage governments to act as ecosystem stewards, funding non-commercial language projects and maintaining open-source infrastructure [73-78][79-88][90-108][121-148][208-218][313-321].


Conclusion


The launch of the Policy Report and Developers Toolkit marks a concrete step toward an inclusive, open-source voice AI ecosystem for India that can be replicated globally. By aligning policy, technical, legal and evaluation efforts, participants underscored that continuous, community-driven data creation, responsible governance and user-centred evaluation are the pillars upon which sustainable, equitable voice technologies must be built [24-52][73-78][90-108][121-144][256-270][313-321].


Session transcript

Complete transcript of the session
Amitabh Nag

including, you know, Southeast Asia as well as Africa and other places. So from that perspective, it is very important that we scale these solutions. We have policies, standards, toolkits which are developed which can be actually replicated. And frankly speaking, in this area, in this situation, nothing is static. You have a shelf life which is sometimes three months or six months or even less. Yes. So we have to continuously upgrade the things as we go by. You know, we can’t be saying that this is what we have done, unlike a machine which we have built up and it works for six years or five years. There is no guarantee, no warranty in these kind of systems which we are building in AI.

AI, and the reason for this is diversity. You know, each person is different. Each language is different. Each culture is different. So there is… There is huge amount of diversity and we have to live with the diversity unlike the earlier digital systems which used to work on only standards. You know, they had standards and they would perhaps keep the outliers away. Here, inclusion is the name of the, inclusion is part of the design, diversity is part of the design. And we would perhaps have to go step by step to define those diversities so that they start becoming standards. Right. You know, it’s a very different kind of a setup which is there, and happy to be part of this journey, and happy to acknowledge the help which is being provided.

And hopefully we are going to get across to the next level and higher steps in the journey as we go by in future. Thank you very much.

Moderator

Thank you, Mr. Nag, for your insightful words and also for your incredible support throughout the last year over the course of the program. Right. Thank you. I will now invite Dr. Ariane Hildebrandt, Director General of the Department for Global Health, Equality of Opportunity, Digital Technologies and Food Security of the German Federal Ministry for Economic Cooperation and Development, to deliver the keynote address. Thank you. Thank you.

Ariane Hildebrandt

Dear Mr. Nag, dear partners, distinguished guests, it is a great pleasure to welcome you to this launch today. We present to you the Policy Report and Developers Toolkit “Building on Open and Responsible Voice Technology Ecosystem in India.” The report and the toolkit are the impressive result of a very productive partnership between Germany and India. And it is the result of a joint effort involving a group of distinguished partners and experts. This is why I would like to start by thanking you, Mr. Nag, and your colleagues from Bhashini, for the excellent cooperation. And I would like to thank the Digital Futures Lab, ARTPARK, Trilegal, and NASSCOM for their invaluable support. Dear guests, you will find that the report and toolkit that we are presenting today is full of best practices and lessons learned.

It will provide guidance and hands-on advice to policymakers and to the tech community alike. But for me, this report is more than useful and more than practical content. It also conveys a shared conviction, shared values, and a shared vision for digital inclusion. In fact, when it comes to inclusion, voice technology has a key role to play. For millions of people, voice is the most natural and powerful interface to the digital world, especially for those with limited literacy or access to digital devices. When voice AI works in local languages and dialects, it will become a gateway to public services, healthcare, education, and economic participation. When it does not, AI risks reinforcing existing divides and may even become an instrument for exclusion.

This is why responsible, inclusive voice AI is not just a technical issue. As I said, it is part of a shared vision, a shared vision between India and Germany. At a time when artificial intelligence is often framed as a global competition, this report offers a different narrative, and this is a narrative of cooperation. The Indo-German Partnership on AI, and particularly on language and voice technologies, shows what is possible when we join forces. Together with Bhashini and the Indian Institute of Science, our initiative Fair Forward has created open voice technologies for nine Indian languages. These language models can now be used by NGOs, state agencies and companies. For example, they can be integrated into voice assistants for health workers, which in turn can improve health care for women.

Or they can be used to advise farmers on crop management. This collaboration, based on the principles of openness, fairness and responsibility, is the foundation for AI that truly serves the common good. And it contradicts those who claim that only fierce competition can generate prosperity and innovation. Ladies and gentlemen, this approach closely aligns with the principles articulated in the Hamburg Declaration on Responsible AI for Sustainable Development Goals. This declaration, presented by BMZ, our ministry, and UNDP last year, has been endorsed by more than 50 stakeholders already, including governments, international organizations, NGOs, and companies. The declaration reminds us that AI should serve the people and the planet, strengthen inclusion, and support sustainable development.

And our report here is a very practical and relevant contribution to that agenda, translating shared principles into concrete guidance. So let us thus deepen cooperation, strengthen trust, and build voice technologies that truly speak to everyone. Thank you for your attention.

Moderator

Thank you so much, Dr. Hildebrandt. We shall now move on to the formal launch of the report and toolkit. I’ll invite all the representatives of the consortium from GIZ, Trilegal, ARTPARK, NASSCOM, Digital Futures Lab to please come on stage. And Mr. Nag to present the data. Thank you. Thank you. Thank you. Thank you. Now that we’re done with the formal launch of the report and policy toolkit, just to give you a brief overview, I invite Ms. Harleen Kaur, Research Manager, Digital Futures Lab, to present the report.

Harleen Kaur

Good morning, everyone, and thank you for being present on a Friday morning for the launch of this report, as well as the developer toolkit. So I’ve linked the outputs in case you’d want to see them. If you can take a quick photo, I’ll move towards discussing the high points of the findings that we had both for our policy report as well as developer’s toolkit. So when we began this work last year, we found that the challenges that are there in the voice tech arena, they are not limited to data collection alone. So the challenges are multi-layered: they start right at the data collection stage and curation stage, but then move on to model development, where we see linguistic diversity gaps, lack of standards, uneven documentation, unclear data ownership and structures being a problem.

But then when we move on to the hosting and licensing aspect, long-term infrastructure costs, governance of open source assets, as well as sustainability of shared resources, is something that we felt was a very important problem that needed to be solved in a certain manner. And the last is downstream deployment and impact, where bias, exclusion and lack of accountability for misuse become more visible. All of these are essentially starting at the data collection stage, but they move on to the life cycle of the voice technology ecosystem in India, specifically when you feel like supporting an open voice ecosystem in India. To lay down our approach for this project, we thought about how we can move on from the traditional government systems, where government has primarily acted as a regulator, it enforces rules, it corrects market failures, to a newer active role, and that we have seen with Bhashini.

We encourage governments across the world to adopt this framework where the government acts as a steward of public good, ecosystem convener, as well as a standard setter, not just through licenses, but actually through practice as well. This is the overview of our policy framework. Based on this approach, we have structured our policy framework around the four pillars that you see on the screen. The first is treating foundational data sets as public goods. Second is institutionalizing sustainable open source infrastructure. Third is building open and representative models. And finally, strengthening responsible deployment. And what do we mean when we say this? When we say treat foundational data sets as public good, we are saying that government should be encouraging both funding and convening for public good functions.

For example, supporting languages that are not commercially viable as such. Institutionalizing governance frameworks to strengthen RAI practices, for example, through procurement, etc. On open representative models, we believe that local and contextually relevant benchmarks that are curated by government bodies not just at the center, but at the relevant diversity ecosystem, whether it is state, district, etc., is important. Shared national compute infrastructure, preferential treatment to the open source ecosystem is something that we propose. On open source infrastructure itself, standardization of documents and promoting collaborative data steward models is something that has already been written in the report. Strengthening responsible deployment, public value sharing is another aspect of the report. We believe that public value sharing comes not just from financial arrangements, but also a buy-in of communities into what kind of…

uses of voice technology are there. And of course, supporting public literacy to protect against misuse and preventing harms is the policy side of our suggestion. Moving on to the developer’s toolkit. You know, policy intent alone does not ensure inclusive AI systems. So alongside the policy framework, we’ve developed a developer toolkit that translates some of these principles into practice for developers. So it focuses on three broad areas, representation being the foremost through diversity planning, et cetera. Second being data quality and evaluation. And the third one being embedding RAI practices throughout the lifecycle of development of open voice technology. I’ll just give you a brief overview of what we mean when we say this. So for developers, we have a toolkit that includes best practices that we’ve seen in industry.

And we have a toolkit, drawing on what we’ve seen in India and outside, on what it means to ensure adequate representation. Things like having a diversity wish list, making sure that you’re not collecting data from one source, applying linguistic expertise, using synthetic data, training models for linguistic and environmental nuances, and also a layered data strategy. Which again means: don’t just use one source of data. Don’t do active or passive collection alone. Use a hybrid layered structure to make your models more diverse.

Once developers move on from data collection to curation, we suggest many, many ways, and this is just a very bird’s-eye overview, in which data quality can be enhanced within the constraints that we operate in, in countries like India. And there are suggestions to make the applications inclusive and useful in practice, including robust transcription standards, contextual benchmarks, using data cards and model cards that are standardized, as well as continuous post-deployment monitoring. You can find more details in the report itself. And the last aspect of the developer’s toolkit is actually embedding RAI practices. We’ve taken another lifecycle framework within this, where we believe that RAI practices are not the domain of policy alone. At the enterprise, startup, developer level, ensuring a framework that serves to support them by providing them clarity on what it means when we say your output should be responsible.

So, things like: be mindful of engagement with the communities from whom you are taking data and with whom annotation is happening, consent protocols, privacy enhancing techniques. So this report essentially is compliance plus. It actually shares practices that we believe are useful to promote an open, responsible AI voice technology ecosystem. Please feel free to engage with the reports. We’ll be very happy to take your comments and suggestions. Thank you so much.

Moderator

Thank you, Harleen. We shall now move on to a short panel discussion on voice technologies in India: unpacking the present and future of the voice AI application ecosystem for India and beyond. Joining us today, I will invite to the stage Mr. Amitabh Nag, CEO of DIBD; Dr. Prasanta Ghosh, Associate Professor at the Indian Institute of Science; Ms. Kritika K.R., Head of Artificial Intelligence and Product Research, SanLogic; and Mr. Thomas Vallianeth, Counsel, Trilegal. And this discussion will be moderated by Mr. Nihar Desai, Head of JNI. Thank you.

Nihar Desai

Hello. Hello. Am I audible? Okay. Thanks, everybody, for joining. So, just delving right deep into it, my first question would be to you, Mr. Nag. As we saw in the toolkit, we were arguing that foundational data sets, speech data sets, must be treated as DPIs and DPGs and hence be available in general. From your experience in driving this ecosystem for about two years, since I’ve been a part at least, what does it take to continue creation, ongoing facilitation of such innovations being put up as a digital public good while ensuring trust and safety, right? And is there a way for us to have a flywheel of data of sorts, data goods of sorts?

Amitabh Nag

Yeah, that’s a very important aspect of what we should be doing. That means continue the creation of data sets because it will then improve the models as we go by. Now, continuation of creation of data sets are, I would say that these are going to be in two or three ways, you know. One is the way which we have been… doing, which is the brute data collection, which is going to the various fields and then picking up the data from there and then creating the diversity which is required to actually build the model. So that is one way of doing it and that will continue. We will have to keep the focus with respect to saying that now I am doing for this particular area, this particular dialect, this particular language, while as it will be for other language in some other way.

The second is to actually look at using the products which have been developed using these models and creating such open domain activities to create the digital data. So you are creating the digital data which you are speaking, automatically creating the parallel corpus and then finding a way to actually vet this out and annotate and label and saying that, okay, this is the improvement corpus. That is the second thing. So one, you are creating a primary corpus. Second is… you are creating an improvement corpus which can be again fed back to the model and say that this is what is to be used and that is a big area of work as we look at. Allied to that is a lot of also the digital data is getting created any which way in the open domain which we can actually use to build the corpus again.

So, you know, YouTube videos; today the world is more digital than it was yesterday. But the conscious way of looking at it as a program is what is required. How do I look at it as a program that I will be creating a data corpus at various places, and this need not necessarily be an open domain. Open domain is kind of an easy way to work upon it. It can be a closed domain as well: there is an application which is working in an enterprise or a government, and the people there are given an option to give suggestions to the translations or the answers or the things which have gone in, and that can get into a vetting pipeline and you are able to create that.

So those applications which are related to this, when we are looking at the AI portfolio, not only languages but the AI portfolio otherwise, are very important for us to be on a continuous improvement journey. The most important aspect hence would be that if a person, for example, is working on an enterprise system of mails, for example, and it is actually deriving some summary of a document, in perhaps a known language or not a known language, and the summary differs from what he thinks as a manual activity, he should be able to put that down somewhere, and that goes as a feedback to the model. Currently that is a concept which may or may not exist; some enterprises would have done it, other enterprises would not have done it.

So looking at these kinds of interventions which can be run as a program in a conscious way, so that everybody is able to contribute into the system his or her own things and then take it back from there, you know, improve the model or improve the AI systems, because they still require a lot of interventions from each and every person. The knowledge still is deficient. Thank you.

Nihar Desai

So what I’m taking away is that data sets need to be more of lived-in nature. It’s not static. It has to be built upon by users and by others. And also just the fact that the feedback itself could lead to better data quality, which is something that enterprises might be doing, but it could definitely be done more. Thank you for that input. But to his point on the first question on data set inclusivity, Prasanta, going back to your research activity, mostly on inclusive data sets: the toolkit also argues that inclusivity must be designed at the foundational data layer at the time of designing data sets. But still we do find data sets which do lack this aspect.

What’s your take on what are the gaps over here at the research and academia level in terms of designing better inclusive data sets that could hence lead to better applications down the road?

Prasanta Ghosh

That’s a very deep and good question. So to cover the diversity and become more inclusive, one approach would be to cover it in the data, right? But if we think about the diversity that is there in Indian languages, right, that is a function of the culture, caste, local knowledge and everything, right? And while we see the diversity, they are not independent elements. There are certain commonalities and certain uniquenesses in each of these languages and dialects and accents that we talk about. So one important direction in modeling would be to think about these intrinsic basis components that finally lead to this diversity, instead of a brute force way of covering data from all parts of the country.

So if you can discover, for example, just an example, I’m not an expert in linguistics, but if you look at the Indian languages, there are two broad families, right? One is Indo-Aryan and the other is Dravidian. Now, while there are multiple languages within each of the streams, we may say, well, to cater certain technologies to speakers of these languages, should we go ahead and collect a good amount of data in everything, in each of those? That may not be the only way to think about it. How do we balance and make a trade-off between the amount of data we collect, which we know is challenging and costly as well, and a novel modeling where we start from those intrinsic basis components and then manifest into those individual diversities?

I think that may help us to jointly think about modeling and collection for catering to this diverse population.

Nihar Desai

If you could help the audience with one example of what you mean when you say balance both aspects. Let’s say we could pick up one of your initiatives, Syspin, Respin or Wani or any other data set. How did you manage or balance inclusivity versus model building activities versus maybe other factors that might be coming into play while designing specifications?

Prasanta Ghosh

Yeah, so the aspect of modeling that I brought out is something I would say not very well established at this moment. But from my experience in the project ResPin, I can give a concrete example. For example, if you take Telugu as a language, we worked with four major dialectal variations. One is in the region of Krishna-Guntur, another is Vishakapatnam (Vizag), another is Anantapur-Chittoor, another is Nalgonda. Now, when you look at their intrinsic variations, we see that there are some commonalities. And then there are some unique aspects in each of those dialects. So now think about a brute force approach where I collect a thousand hours in each of them, versus think of collecting certain kinds of stimuli to cover the actual acoustic space of the speakers, maybe from one region, that will automatically cater to the other region. And then collect something that will complement it in each of the other regions, right? So that way, our overall timeline, budget, cost will all go down. And there has to be a novelty in terms of having a model that will start from the intrinsic one and then naturally diversify itself to cater to those populations. So that has become a region-anchored approach that we started later on in Vani.

Nihar Desai

I see. Okay. Thanks for that input. Just to summarize, what I’m taking away is that instead of having a brute force approach, what we’re essentially saying is balancing across various parameters on the basis of which you would train a model, such as linguistic diversity, acoustic diversity, and then using some sort of a smart approach to dissect the current audience and ways of collecting data, to maximize the output while maximizing bang for the buck. Thanks for that input. But this… this is also slightly… you are coming from the perspective of academia. I would like to switch to Dr. Kritika. From the perspective of an applied AI researcher, you are also one of the people on this panel who has really deployed speech AI solutions. What is your take on challenges that you faced with inclusivity, either at the data set layer or the application layer?

Kritika K.R

More towards the core of the enterprise applications, knowledge repo integrations are coming up, be it healthcare, or even manufacturing, automobiles. So voice being the go-to interface for different applications and enabling the workforce across the industries is coming up. So in that case, again, as I said, on the consistency with the various user scenarios and more specific to the domain adoption: specialized domain adoption is required. That feedback loop is more important while the system is in practice or while the system is in progress, I would say at that point. And the more critical aspect is on giving the scalable and sustainable infrastructure that comes with more optimized models, and also bringing in the edge deployments.

So that the real adoption can be scaled across multiple… industries and the normal usage for… various sectors across the industry. So I’m talking more on the end user perspective and using, getting the data. Data is one source of it, but making it reliable across the infrastructure and also giving the required scalable model at the device intelligence level is also important when it comes to the real adoption of these AI models.

Nihar Desai

Thanks for the input. So I guess, after all, industry is also using feedback as a tool. It’s a nice validation over here. Yeah, maybe coming to Thomas, switching tracks to the slightly legal side. We’ve seen, at least in the toolkit also, we’ve argued that speech models and speech data sets are at the intersection of copyright law, you know, data governance and security, etc. And how do you propose, how do you propose balancing sort of innovation versus caution on this side, especially with all the researchers and practitioners in the room?

Thomas J. Vallianeth

Thanks, Nihar. That’s, again, a very helpful question. I think Harleen had articulated it quite well in the beginning when we have to consider the entire ecosystem as a whole. There is a common myth in India that anything that is public is freely available. I think what we have to think about is also that, you know, all data sets operate at the intersection of privacy law and copyright law. Under privacy law, most publicly available data sets are essentially freely available to be used under, you know, even the new legislation. But under copyright law, even if it is publicly available, somebody else may own the copyright on that. So there has to be careful thought put in place right from the beginning itself in terms of what data sets you’re collecting, what is the copyright provenance of it, are you able to defer to, you know, freely licensed and open source kind of material to compile that data set, and if not, are you able to obtain the licenses to do so?

So the thought process from the beginning in terms of how you’re structuring the way to get this and also how to reduce the surface area of the impact of some of these laws. So for instance, in relation to privacy laws, if you’re collecting somewhat more private data sets, if you can use privacy enhancing technologies or you’re able to extract data such that no personal data is ultimately captured or stored at the point of data collection, all of these are various ways in which you can put in place mechanisms right from the start of when the ecosystem begins to ensure that downstream use cases are also protected in that sense. The second big aspect is, of course, the documentation, right?

Now, the data collector, the data creator is essentially the person who is the gateway to the entire ecosystem in some senses. The documentation has to be robust right from the beginning to enable everybody in the downstream chain to be able to use this data and to ensure that there’s a good and safe and trusted ecosystem created with respect to that specific data set. So yes, there are flexibilities that are available under the law in terms of how you are able to use voice data sets, but at the same time, there’s some caution that you have to put in place right from the beginning and throughout the life cycle of this in terms of figuring out how to be able to use these data sets effectively.

Of course, the last kind of related aspect to this is to think about the various layers in which these legalities operate. So of course, you can think of the speech data set itself as being copyrighted, but equally, if they are reading out of a book passage or if they’re reading specific performance and so on, there may be separate rights that are allocated in relation to some of these other tangential elements as well. All of these are to be accounted for from the very beginning of the ecosystem itself such that downstream usage is not… in that sense impacted. So I would say, you know, the report’s argument in that sense is that think about it as a whole.

Don’t think of each action in isolation. Think about the entire impact downstream as well. And then account for both either enabling maneuvers under law in terms of documentation, privacy enhancing techniques and so on, or implement the appropriate cautionary mechanisms to ensure that downstream usage is also protected.

Nihar Desai

Yeah, at least in some of the hats that I wear, I am also collecting data sets, and those are important points that we keep in mind. And hopefully we’ll be able to take the learnings out of the toolkit to actually implement in our processes. Switching tracks slightly to Dr. Prasanta here: without measurement, right, we don’t really get anywhere in terms of implementing the right frameworks, implementing the right legal processes, etc., in terms of measuring quality. You’ve also spoken about evaluations being broken as far as Indian contexts are concerned. Can you elaborate a little bit on what challenges we face on a day-to-day basis, where do they come across, and how do you foresee these sorts of challenges either getting resolved or getting amplified? Again, I think this is an important area that all of us together should explore and contribute to.

Prasanta Ghosh

So when we build something like an automatic speech recognition system that is being used in many, many applications, think of this to be yet another human who is listening to the audio and trying to spit out what is spoken in text. Now, if you go out in the real world, as we have realized multiple times and experienced through multiple projects, in ResPin as well as Vani and many other projects that I have done, what we find is that if you give a piece of audio to two individuals, they never exactly agree on what they hear.

And I’m telling from my experience, not from two different parts of the country, I’m talking in terms of, you know, two people from the same district. In fact, there was an incident where we realized that these two people were just three kilometers away in terms of their location, but still they did not agree how that should be written from the audio they hear. So what it tells us is there is an inherent variation or variability in the way as an individual, as an Indian, I perceive or I like to see the text as, right? Now, if we accept that fact that exists today, we need to think of building our systems and system evaluation to cater to that variation.

So we need to think of that variability and to be… robust to that variability. So if, as I said in the beginning, we treat the system also as a human, it will also not agree with another human. So if we just go by word-by-word comparison of how the system performs compared to some of the humans, certainly it will not be 100% accurate. Or in other words, we calculate using what we call word error rate, which is an objective way of evaluating. So a word-based comparison is probably not the right way to go at this point. Maybe the ASR system is doing pretty well, but just because it made a slight mistake in one of the words, we are penalizing it and telling that it’s not doing well.

So now we have to think about how do we solve this problem. It could be that we have a multiple evaluation system where we just don’t use word error rate. That’s one aspect. Another way to think about this will be to build ASR so that it itself can give not just one output, rather multiple outputs, which could be potentially right, and then evaluate that not just objectively but also subjectively through a human, because a human can absorb that error and say, yes, still it’s okay. Third will be to take that to the downstream application where, depending on what you are using, it could be an LLM or any other QnA system that can absorb that robustness. So I think we need to break down the entire evaluation system into multi-layered evaluations, and then they are not really independent; we need to take feedback all the way down from the final application back to ASR and so on, so forth. So I guess here individuals from the application areas, individuals from the linguistic background, engineers, everyone has to come together and…

Nihar Desai

So what I am hearing is that solving this is more of an ecosystem-level challenge, right? And maybe before our ecosystem champion over here, Mr. Nag, before you come in on this, I would just like one industry perspective from Dr. Kritika: how do you solve this from an application standpoint? Prasanta explained this challenge from more of an academic or foundational research standpoint. But how does evaluation play a role in your daily application layer?

Kritika K.R

Yeah, so as I said, right, the applications are varied. So now the adoption is at the conversational level, right from bringing the analytics out of the data. Then now it is more on the voice interface and the multilingual conversation. Now with speech-to-speech translation, those things are more prevalent with the conversation right now. Now coming to the industry application, the industry aspect of it, yeah, adapting these models to the custom data set is one way. And also the right pick of sourcing the data from the available open source, so that this model will be more specialized to those particular tasks and the work they are supposed to do. So now coming with the LLMs, these models are more adaptable to the industry jargons or even the core of the industry workflow.

Now making AI with the ASR models also enabled with the LLM, you have various methods from the data creation perspective, leveraging the open source data, and also custom tuning the data to the various industry use cases, definitely with the required compliance. And these open source models are also enabling the on-prem deployment of these models, which enables the security aspect when it comes to creating the model for different core industry applications, so that the models can be much more fine-tuned or trained across the domain, keeping the compliance aspect and the security aspects intact.

Nihar Desai

So, having heard both of these perspectives, Mr. Nag, just from your experience standpoint, how do we approach resolving this conflict, where all of us sort of concur that we need a better framework for evaluation, but it’s also in some ways nobody’s problem at the moment? So is there a way to break this?

Amitabh Nag

So let’s step back and let’s evaluate our conversation itself. You know, is there a framework by which we can say who has spoken better language? Right? It was as good as other people understand it. You know, if the audience is able to understand what I am speaking and what I am intending to speak, that is what is going to be the final evaluation by any aspect. What we have to actually look at is that we have to reach a level by which it is acceptable to the people who are sitting in front of me. I don’t think we will be able to ever reach a situation where we will be able to say that this is the best, second best, third best.

It is a situation where, ultimately, the audience decides whether they are in a position to do that. We are looking at a few of the use cases where we have actually deployed these technologies, and we incidentally, you know, go through various evaluations. One of them, incidentally, is grievance, and when we were giving it to the last, to the person who is actually the owner of the system, the acceptance was supposed to be taken up by various ministries. So one ministry would say that this model is better. The other ministry would perhaps disagree. It’s a question of perception, and ultimately the audience would decide. And some would like the tone of speaking, some would like the modality, some would like the pronunciation.

So it’s all based on what the person’s perception is. Now, is there a common way in which we can say that this is the acceptable thing? Right? But then also we will have differences. You know, many of the public figures, for example, when they speak, you know, Hindi or English or whatever language, there are gaps in the language, but still, you know, they are understood. They are able to connect to the people. So we have a difficult challenge. Rather than looking at it only from a perspective of application or academics, we would have to look at it from a perspective of audience. But then we also have situations which require accurate and perfect transcriptions. Like, for example, if I’m arguing a case in a court, you know, I can’t have variations in terms of languages. If I am, for example, trying to be in a meeting where I am saying something, again, I cannot have variations. But for that also, we will perhaps have to step two steps back and look at purity of language with respect to the acceptance. Because most of our language has become impure because of the fact that we are, you know, using mixed code most of the time, especially in the cosmopolitan areas. And in the other areas, even if we are having native language, dialects are taking over.

So it’s a very complex problem. It’s not an easy problem to solve. At this point in time, when we are looking at how do we actually take it forward, I would tend to say that we should look at what is acceptable to the audience and then start working back to define an acceptable way in which the models can go out into the market.

Nihar Desai

Yeah, that’s an important point, that so far we’ve been looking at it mostly, at least I have been looking at it mostly, from the lens of application versus academia, but maybe we need to go from a what-works point of view and not really from just the traditional ranking point of view. But Thomas, in a world where, and this is, we’ve not talked about this and this might be a curveball, but in a world where evaluation is slightly subjective and no longer objective, how does law see this? How do you make decisions for procurement? How do you resolve arguments, differences between two opinions, especially in cases where both might be right and it’s a gray area? Like, do you foresee these sorts of scenarios coming in, especially with Gen AI, which is like…

Thomas J. Vallianeth

To be fair, I think the legal principles, at least on this, are somewhat more clear, at least in terms of some of the more privacy-facing or copyright-facing principles; they occur much before outputs, for instance, are produced or any of these methodologies are implemented. And we have a body of law that has existed for many years in India. It’s just a question of how do you lead evidence in relation to some of these matters. So if it ever comes to the question of: is a specific output right, or is a specific output implying this or implying that, I think where we haven’t caught up as a country is in terms of how to evaluate the evidentiary standard in relation to that. The principles, of course, are fairly laid out, saying that this is how you would decide it, but what you would show the court to say this is the evidence for that, that’s something I think that’s still evolving. But I think it also brings me to a larger point, and I think we’re making it in the report as well, which is that, you know, there is a measure of trust that needs to be put in place in the ecosystem as a whole, right?

Irrespective of what the outcome of evaluation may be, there are measures that you can put in place right from the get -go. And one example I can give you is in relation to harmful content, right? Now, if there is a debate in relation to whether content is harmful or not, and it is a subjective determination, you can avoid that question to some degree by putting in place the necessary rails and safeguards right from the beginning itself so that trust is engineered into the process already as opposed to having to face that choice kind of downstream. But yes, to your point, and if we’re coming to a place where we need to face that question, I think the principles exist, but how you lead evidence, how you show the court that one is the interpretation over the other, still developing and very, very subjective.

I think some of the cases that, you know, the prominent AI players have in the country will go a long way to develop some of those standards, but at least as of now, the court system is still trying to catch up to some of these principles. Documentation goes a long way to show intent. Methodologies that you have implemented go to the extent of showing that you assumed reasonably high enough safeguards, reasonably high enough principles. All of these go a large extent to show intent. And so the subjectivity, I think, in that sense is far reduced if you put in place some of these measures that bring trust in the entire ecosystem. So I think that one flashpoint of failure perhaps is tough to look at for the courts as well.

But if you look at it from an ecosystem perspective, I think there is a lot there that may reduce those flashpoints of failure, or those flashpoints of evaluation, at least from a legal perspective.

Nihar Desai

I see. Thanks for that summarization: the law as such is at a stage where it can accommodate some amount of subjectivity, but there needs to be dialogue and more policy decisions to make it crisper, and of course to follow through into the application of the law. Thanks for that input. The last question leaves the floor open for any inputs. The topic at hand is challenges and best practices for speech models and datasets at the ecosystem level. From your experiences, are there any open points, any arguments you would like to make, or any sort of call-out to the ecosystem?

Amitabh Nag

The call-out I would make is that many of the things which were indeterministic or unknown a few days back have started coming into a situation where we are able to crystallize them. So I think we need to get into more workshops and more discussions, take up more use cases and study them in detail, and figure out a framework by which acceptability and evaluation are properly benchmarked.

That’s a good point. Go ahead, Thomas.

Thomas J. Vallianeth

I have a point to add here, which is, you know, I think there is a certain sense of affinity in this ecosystem towards open-source datasets and open models. I would be more thoughtful about how and when these are suitable. Whether there are particular safeguards you need to put in place for open-source datasets is something you need to think about. Are there end-use considerations that need to be tailored? A good example: I’ve seen a case where somebody is training a model to detect hate speech. Now, the safeguards you would put in place for a hate-speech detection model are different from those for a dataset and model you would develop for regular speech-to-speech translation.

So the decision as to what licensing frameworks and what documentation frameworks apply needs to be informed by the end-use case you are pursuing, the unique attributes that arise from the specific datasets and applications you are considering, and finally the downstream users you are expecting. The choice needs to be made, I think, in a little bit more of a conscious fashion.

Nihar Desai

Prasanta, you wanted to say something?

Prasanta Ghosh

Sure. Your question actually stimulates me to think about English, I mean, about models that were built on American English. There has always been standardization in evaluation there; in fact, if you look at the NIST evaluations, there have been various protocols and an annual call-out for whoever beats the best baseline achieved so far. I believe we have to do the same in our country, in India, at least for Indian languages, and it is very diverse, as we just discussed. So first of all, we should think about how to evaluate, and then create a national-level framework for evaluation.

And every year, let’s assess ourselves, all these stakeholders, right?

It could be general evaluation or application-specific evaluation, in each language or dialect. And then we really have a leaderboard; of course, there are many individual leaderboards across the country, but let’s have only one, under Bhashini, let’s say, right? And it should be elaborate enough to cater to all languages and dialects. Maybe that’s not the right way, but think it through and make sure every year we make progress in each of those. I think that has to be brought into the system to bring competitiveness, in a collaborative way, of course. And overall, that can help improve voice technology in Indian languages. And the reason I’m saying this is mostly from my understanding of, and experience with, what has happened with English in the past.
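A purely illustrative sketch of what such a single national leaderboard could aggregate is below: it computes word error rate per language and ranks systems by macro-average, so that smaller languages count equally. The system names, test sentences and scoring choices are invented assumptions, not an actual Bhashini benchmark.

```python
# Hypothetical sketch: per-language WER aggregation for a single national leaderboard.
# System names and test data below are invented for illustration only.

def edit_distance(ref, hyp):
    """Word-level Levenshtein distance via dynamic programming."""
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)]

def wer(reference, hypothesis):
    ref, hyp = reference.split(), hypothesis.split()
    return edit_distance(ref, hyp) / max(len(ref), 1)

# One (reference, hypothesis) pair per language; a real test set would have thousands.
submissions = {
    "system_a": {"hi": ("नमस्ते आप कैसे हैं", "नमस्ते आप कैसे है"),
                 "ta": ("வணக்கம் நண்பரே", "வணக்கம் நண்பரே")},
    "system_b": {"hi": ("नमस्ते आप कैसे हैं", "नमस्ते कैसे हैं"),
                 "ta": ("வணக்கம் நண்பரே", "வணக்கம்")},
}

leaderboard = []
for system, per_lang in submissions.items():
    scores = {lang: wer(ref, hyp) for lang, (ref, hyp) in per_lang.items()}
    macro = sum(scores.values()) / len(scores)  # macro-average: small languages count equally
    leaderboard.append((macro, system, scores))

for macro, system, scores in sorted(leaderboard):
    print(f"{system}: macro-WER={macro:.2f} per-language={scores}")
```

A real framework along these lines would add dialect-level splits, multiple references per utterance, and the application-specific tracks discussed above.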

Nihar Desai

Yeah, interesting points, Prasanta. I hear you speak passionately about evaluation, and now you’re taking it one step further in terms of how we really create a unified framework for evaluation, in a competitive yet collaborative manner for the ecosystem, housed under a central, impartial entity like Bhashini. This is a great point. I hope the audience found some of these points helpful and enriching. Thank you so much for making time in what is sure to be a very busy event, and I hope you have a good rest of your day. Thank you. I now invite Mr. Shailendra Pal Singh, Senior General Manager, Bhashini, to felicitate the speakers.

Thank you. Mr. Amitabh Nag, Dr. Prasanta Ghosh, Dr. Krithika K.I., Mr. Thomas J. Vallianeth, and Ms. Harleen Kaur. Thank you to all our speakers for walking us through this rich tapestry of voice technologies and their life cycle in the Indian context, and we hope you read our report and the toolkit and find them useful. Thank you so much. Thank you so much to the audience for staying with us patiently throughout this entire hour. Thank you.

Related Resources
Knowledge base sources related to the discussion topics (32)
Factual Notes
Claims verified against the Diplo knowledge base (5)
Confirmed (high)

“Diversity of people, languages and cultures makes inclusion a core design requirement rather than an after‑thought”

The knowledge base stresses that diversity of languages, cultures and people is essential for inclusive AI systems, as noted in [S10] and reinforced by Yann LeCun’s comment on the need for multilingual training in [S101].

Confirmed (high)

“Voice AI is a gateway for low‑literacy populations to access public services, health care, education and economic participation, and failure to provide multilingual voice interfaces can reinforce exclusion”

Multiple sources describe multilingual voice AI as a way to bridge digital exclusion and serve low-resource users, e.g., the discussion on multilingual AI bridging gaps in [S73] and the emphasis on voice-driven multilingual interfaces for equity in [S113].

Confirmed (medium)

“The initiative is linked to the Hamburg Declaration on Responsible AI for the Sustainable Development Goals”

The Hamburg Declaration on Responsible AI for the SDGs is documented in [S17], confirming the report’s reference to this framework.

Additional Context (medium)

“The policy report and developers toolkit are a product of a German‑Indian partnership”

Broader context on Indo-German AI collaboration is provided in [S111] and the German-Asian AI partnership overview in [S108], which illustrate the existence of such bilateral initiatives.

Additional Context (low)

“Institutionalising sustainable open‑source infrastructure is a pillar of the policy framework”

The importance of open-source solutions for governments in the Global South is highlighted in [S104], adding nuance to the report’s emphasis on open-source infrastructure.

External Sources (118)
S1
EQUAL Global Partnership Research Coalition Annual Meeting | IGF 2023 — Ariana is an aerospace engineer and technology policy specialist who is passionate about creating gender-inclusive innov…
S3
Digital Democracy Leveraging the Bhashini Stack in the Parliamen — -Nihar Desai- Head of JNI, Panel Discussion Moderator
S4
IGF Retrospective – Past, Present, and Future — – **Nitin Desai** – Role/Title: Former MAG chair (approximately 5 years), chaired the working group on Internet governan…
S5
Keynote-Olivier Blum — -Moderator: Role/Title: Conference Moderator; Area of Expertise: Not mentioned -Mr. Schneider: Role/Title: Not mentione…
S6
Keynote-Vinod Khosla — -Moderator: Role/Title: Moderator of the event; Area of Expertise: Not mentioned -Mr. Jeet Adani: Role/Title: Not menti…
S7
Day 0 Event #250 Building Trust and Combatting Fraud in the Internet Ecosystem — – **Frode Sørensen** – Role/Title: Online moderator, colleague of Johannes Vallesverd, Area of Expertise: Online session…
S8
Digital Democracy Leveraging the Bhashini Stack in the Parliamen — -Kritika K.R.- Head Artificial Intelligence and Product Researcher, SanLogic
S9
Digital Democracy Leveraging the Bhashini Stack in the Parliamen — -Prasanta Ghosh(Dr. Prasanta Ghosh) – Associate Professor at the Indian Institute of Science
S10
https://dig.watch/event/india-ai-impact-summit-2026/digital-democracy-leveraging-the-bhashini-stack-in-the-parliamen — Thank you. Mr. Amitabh Nag Dr. Prasanta Ghosh Dr. Krithika K.I. Mr. Thomas Salenat I’m Ms. Harleen Kaur Thank you to all…
S11
Digital Democracy Leveraging the Bhashini Stack in the Parliamen — -Thomas J. Vallianeth(Thomas Valunith/Thomas Salenat in transcript) – Counsel, Trilegal
S12
Digital Democracy Leveraging the Bhashini Stack in the Parliamen — – Thomas J. Vallianeth- Harleen Kaur
S13
Inclusive AI_ Why Linguistic Diversity Matters — -Amitabh Nag- CEO of Bhashini
S14
Digital Democracy Leveraging the Bhashini Stack in the Parliamen — – Kritika K.R.- Amitabh Nag – Prasanta Ghosh- Amitabh Nag
S15
Internet standards and human rights | IGF 2023 WS #460 — Furthermore, there is a pressing need for equal access and inclusion in standard-setting bodies, particularly for civil …
S16
Digital democracy and future realities | IGF 2023 WS #476 — Communities continue to build their own tools and generate content, but they face difficulties in gaining a strong footh…
S17
Hamburg Declaration champions responsible AI — TheHamburg Declaration on Responsible AI for the Sustainable Development Goals (SDGs)is a new global initiative jointly …
S18
Day 0 Event #189 Toward the Hamburg Declaration on Responsible AI for the SDG — – CLAIRE: No role/title mentioned – THIAGO MORAES: Works at the Brazilian Data Protection Authority and PhD researcher …
S19
Multistakeholder Partnerships for Thriving AI Ecosystems — And as I mentioned at the beginning, one of the things that we have been doing with the, as part of the Hamburg Sustaina…
S20
WS #254 The Human Rights Impact of Underrepresented Languages in AI — 3. Facilitating Dialogue: Creating platforms for knowledge sharing and discussion among diverse stakeholders. Gustavo F…
S21
Opportunities of Cross-Border Data Flow-DFFT for Development | IGF 2023 WS #224 — ATSUSHI YAMANAKA:Thank you, Mineta-san. Actually, that’s actually a nice segue into the public goods discussions. I thin…
S22
Global Internet Governance Academic Network Annual Symposium | Part 1 | IGF 2023 Day 0 Event #112 — However, this policy approach has sparked substantial critique for its disregard of other significant aspects of data ac…
S23
Keynote by Sangita Reddy Joint Managing Director Apollo Hospitals India AI Impact Summit — “These health systems of the future connect public and private, connect primary care with advanced care, connect researc…
S24
https://dig.watch/event/india-ai-impact-summit-2026/building-public-interest-ai-catalytic-funding-for-equitable-compute-access — And here, India is not waiting for permission. India is not waiting for permission. India is showing that it can be done…
S25
Bridging the Digital Divide: Inclusive ICT Policies for Sustainable Development — ## Dr. Rahman’s Three-Pillar Framework for Inclusive ICT Policies ### Policy Recommendations ### Three-Pillar Policy F…
S26
Leveraging AI4All_ Pathways to Inclusion — The report identified three interconnected pillars essential for inclusive AI: design, access, and investment. The desig…
S27
Transforming Agriculture_ AI for Resilient and Inclusive Food Systems — So bridging these gaps, which should be a priority for all of us, requires investment in connectivity and other digital …
S28
Connecting open code with policymakers to development | IGF 2023 WS #500 — Conversely, the potential negative effects of open source were also discussed. The speakers raised concerns regarding th…
S29
Dynamic Coalition Collaborative Session — Eleni argues that inclusive OER ecosystems need backing by institutional frameworks, funding and standards, including op…
S30
Inclusive AI For A Better World, Through Cross-Cultural And Multi-Generational Dialogue — AI policies in Africa should ideally espouse a context-specific and culturally sensitive orientation. The prevailing ten…
S31
WS #288 An AI Policy Research Roadmap for Evidence-Based AI Policy — Isadora Hellegren: Thank you so much, Tatiana. It is really a true pleasure to be here with all of you today. And before…
S32
Open Forum #64 Local AI Policy Pathways for Sustainable Digital Economies — ## Introduction and Context Abhishek Singh: Thank you for convening this and bringing this very, very important subject…
S33
Main Session | Dynamic Coalitions — Tatevik Grogryan: I would like to start by saying that we have a number of stakeholders in this cluster, the first one o…
S34
Open Forum #29 Advancing Digital Inclusion Through Segmented Monitoring — Fabio Senne: No, yes, I agree with this discussion of the cycle. It’s interesting because if you take, there’s a very st…
S35
Operationalizing data free flow with trust | IGF 2023 WS #197 — In summary, data flow is fundamental to our modern society, as it underpins almost all aspects of our lives. Establishin…
S36
Catalyzing Cyber: Stimulating Cybersecurity Market through Ecosystem Development — Cybersecurity plays a critical role in protecting strategic companies and assets from daily attacks. Saudi Arabian Milit…
S37
WS #484 Innovative Regulatory Strategies to Digital Inclusion — Strong consensus exists on core challenges (coverage vs. meaningful access, device affordability, need for skills) and t…
S38
High-level AI Standards panel — Paul Gaskell: Thank you, Bilel. So, I mean, as a government, we recognize that digital standards really matter. So we’re…
S39
AI That Empowers Safety Growth and Social Inclusion in Action — Multi-layered approach is needed including model requirements, application testing, executive review, and post-launch mo…
S40
Risks and opportunities of a new UN cybercrime treaty | IGF 2023 WS #225 — Lastly, inclusive involvement of the technical community in the policy-making process is advocated. The technical commun…
S41
Dedicated stakeholder session (in accordance with agreed modalities for the participation of stakeholders of 22 April 2022)/OEWG 2025 — Singapore emphasizes the need for capacity building efforts targeted at the leadership level. They argue that such progr…
S42
PERMANENT MISSION OF THE REPUBLIC OF SINGAPORE UNITED NATIONS NEW YORK — – b) Emphasizing that there is no one-size-fits-all solution to capacity-building, States proposed that efforts to tailo…
S43
Operationalizing data free flow with trust | IGF 2023 WS #197 — It emphasizes the need for balance, regulation, and global alignment to ensure that data flows are both efficient and se…
S44
Digital Democracy Leveraging the Bhashini Stack in the Parliamen — But then when we move on to the, hosting and licensing aspect, long -term infrastructure costs, costs, governance of ope…
S45
https://dig.watch/event/india-ai-impact-summit-2026/digital-democracy-leveraging-the-bhashini-stack-in-the-parliamen — But then when we move on to the, hosting and licensing aspect, long -term infrastructure costs, costs, governance of ope…
S46
WS #106 Promoting Responsible Internet Practices in Infrastructure — This comment broadened the stakeholder discussion to include the open source community as a critical but often invisible…
S47
Connecting open code with policymakers to development | IGF 2023 WS #500 — Conversely, the potential negative effects of open source were also discussed. The speakers raised concerns regarding th…
S48
Digital Cooperation and Empowerment: Insights and Best Practices for Strengthening Multistakeholder and Inclusive Participation — The discussion revealed growing recognition that complex challenges require coordinated responses from multiple stakehol…
S49
How AI Drives Innovation and Economic Growth — High level of consensus across diverse perspectives (World Bank, academia, legal scholarship, development practice) sugg…
S50
Strengthen Digital Governance and International Cooperation to Build an Inclusive Digital Future — These key comments fundamentally shaped the discussion by providing concrete frameworks for understanding abstract chall…
S51
Lightning Talk #7 Privacy Redefined: equitable Access in the AI Age — Low to moderate disagreement level. The speakers generally aligned on identifying problems but differed on solutions and…
S52
The Power of the Commons: Digital Public Goods for a More Secure, Inclusive and Resilient World — – Integrating DPGs into broader policy discussions on climate change, education, and healthcare Alicia Buenrostro Massi…
S53
Digital divides & Inclusion — Discussion on whether internet should be a human right or a public good In terms of online content, the importance of l…
S54
AI as a tech ally in saving endangered languages — For this reason, language technology should be treated as public infrastructure. Not as a symbolic cultural initiative, …
S55
Contents — 1. Policy-makers could clearly identify intended objectives (e.g. to improve data privacy and ensure proper collection o…
S56
A Primer — –  data creation, –  collection, –  organization, and –  use. such example is drone swarms which are ‘made up of co…
S57
Policies and platforms in support of learning: towards more coherence, coordination and convergence — 137. UNHCR has established a centralized systematic learning centre overseeing all learning solutions across th…
S58
Qatar’s Open data policy — The scope of the policy includes all government entities that create, store, or manage data and information. It requires…
S59
Exploring Digital Transformation for Economic Empowerment in Africa: Opportunities, Challenges, and Policy Priorities (International Trade and Research Centre, Nigeria) — Currently, there is a lack of metrics to evaluate the impact of policies in the policy space. It is important to develop…
S60
Presentation of outcomes to the plenary — This aligns with SDGs 13 and 14, which call for climate action and the conservation of marine life. Overall, the compreh…
S61
Pre 2: The Council of Europe Framework Convention on AI and Guidance for the Risk and Impact Assessment of AI Systems on Human Rights, Democracy and Rule of Law (HUDERIA) — Jordi Ascensi-Sala focused on the practical implementation of the HUDERIA methodology, which bridges the gap between leg…
S62
Open Forum #30 High Level Review of AI Governance Including the Discussion — Lucia Russo: Thank you, Yoichi. Good morning and thank you my fellow panelists for this interesting discussion. As Yoich…
S63
EU Artificial Intelligence Act — 1. Detailed description of the evaluation strategies, including evaluation results, on the basis of available public eva…
S64
Report by the Commission on the Measurement of Economic Performance and Social Progress — – 29) The information relevant to valuing quality of life goes beyond people’s self-reports and perceptions to include…
S65
Driving Social Good with AI_ Evaluation and Open Source at Scale — However, audience questions revealed tension between this contextual approach and institutional needs for standardizatio…
S66
AI & Child Rights: Implementing UNICEF Policy Guidance | IGF 2023 WS #469 — This objective evaluation approach eliminates bias and subjectivity that may arise from teachers’ individual assessment …
S67
Global Standards for a Sustainable Digital Future — Dimitrios Kalogeropoulos: Yeah, hello, everyone. Forgive me, but I will read. So the title for me today is Building Brid…
S68
Open Forum #75 Shaping Global AI Governance Through Multistakeholder Action — Devine Salese Agbeti: Thank you. Firstly, we have to align AI with international human rights standards. In that, for ex…
S69
WS #110 AI Innovation Responsible Development Ethical Imperatives — Ricardo Israel Robles Pelayo: Thank you very much. Good afternoon, everyone. It is an honor to be here and share a refle…
S70
Open Forum #64 Local AI Policy Pathways for Sustainable Digital Economies — ### Community-Led Development Abhishek Singh: One part is that, of course, the way the technology is evolving, there is…
S71
Opportunities of Cross-Border Data Flow-DFFT for Development | IGF 2023 WS #224 — This implies that active engagement and participation from individuals are key factors in driving meaningful discussions…
S72
Digital Democracy Leveraging the Bhashini Stack in the Parliamen — Implement layered data strategies using multiple sources (active collection, passive collection, synthetic data) rather …
S73
How Multilingual AI Bridges the Gap to Inclusive Access — Moving beyond the initial 22 constitutional languages to serve broader linguistic diversity requires scalable data colle…
S74
Open Forum #29 Advancing Digital Inclusion Through Segmented Monitoring — Fabio Senne: No, yes, I agree with this discussion of the cycle. It’s interesting because if you take, there’s a very st…
S75
AI That Empowers Safety Growth and Social Inclusion in Action — Thank you very much, Peggy, and thanks for having Microsoft here. So, yeah, I want to start with the inception of our re…
S76
Day 0 Event #173 Building Ethical AI: Policy Tool for Human Centric and Responsible AI Governance — Chris Martin: Thanks, Ahmed. Well, everyone, I’ll walk through I think a little bit of this presentation here on what…
S77
Operationalizing data free flow with trust | IGF 2023 WS #197 — In summary, the fear of government access to data poses a threat to the free flow of data with trust. Microsoft’s statis…
S78
Expert workshop on the right to privacy in the digital age — The perspective of Internet service providers (ISPs) was provided byMrMike Silber, head of the legal and commercial depa…
S79
Day 0 Event #250 Building Trust and Combatting Fraud in the Internet Ecosystem — Emily argues that privacy and criminal justice are not in opposition but can coexist within proper legal frameworks. She…
S80
High-level AI Standards panel — Amandeep Singh Gill reinforced this perspective by advocating for multidisciplinary approaches that embrace socio-techni…
S81
Advancing Scientific AI with Safety Ethics and Responsibility — Evaluation must go beyond model‑centric metrics to include institutional practices, DIY science, and broader socio‑techn…
S82
World Economic Forum Annual Meeting Closing Remarks: Summary — The tone is consistently positive, celebratory, and grateful throughout the discussion. It begins with formal appreciati…
S83
Opening of the session — The tone began very positively and constructively, with the Chair commending delegations for focused, specific intervent…
S84
Summit Opening Session — The tone throughout is consistently formal, diplomatic, and collaborative. Speakers maintain an optimistic and forward-l…
S85
AI for food systems — The tone throughout the discussion was consistently formal, optimistic, and collaborative. It maintained a ceremonial qu…
S86
Panel 5 – Ensuring Digital Resilience: Linking Submarine Cables to Broader Resilience Goals — This comment emphasizes the critical importance of collaboration while also pushing for concrete actions rather than jus…
S87
Multistakeholder digital governance beyond 2025 — The discussion maintained a constructive and collaborative tone throughout, with speakers sharing both challenges and su…
S88
Law, Tech, Humanity, and Trust — The discussion maintained a consistently professional, collaborative, and optimistic tone throughout. The speakers demon…
S89
HETEROGENEOUS COMPUTE FOR DEMOCRATIZING ACCESS TO AI — The discussion maintained a professional, collaborative, and optimistic tone throughout. Panelists demonstrated mutual r…
S90
Bridging the Digital Divide: Inclusive ICT Policies for Sustainable Development — The discussion maintained a formal, academic tone throughout, characteristic of a research presentation or conference se…
S91
Session — The tone was primarily analytical and forward-looking, with the speaker presenting evidence-based predictions while ackn…
S92
Impact & the Role of AI How Artificial Intelligence Is Changing Everything — The discussion maintained a cautiously optimistic tone throughout, balancing enthusiasm for AI’s potential with realisti…
S93
Transforming Agriculture_ AI for Resilient and Inclusive Food Systems — The tone was consistently optimistic yet pragmatic throughout the conversation. Speakers maintained an encouraging outlo…
S94
AI as critical infrastructure for continuity in public services — The discussion maintained a collaborative and constructive tone throughout, with participants building on each other’s p…
S95
Parliamentary Closing Closing Remarks and Key Messages From the Parliamentary Track — The discussion maintained a collaborative and constructive tone throughout, characterized by diplomatic language and mut…
S96
Closing remarks – Charting the path forward — The tone throughout was consistently formal, diplomatic, and optimistic. It maintained a collaborative and forward-looki…
S97
Building Inclusive Societies with AI — The discussion maintained a constructive and solution-oriented tone throughout, characterized by: The tone remained con…
S98
Safeguarding Children with Responsible AI — The discussion maintained a tone of “measured optimism” throughout. It began with urgency and concern (particularly in B…
S99
Global dialogue on AI governance highlights the need for an inclusive, coordinated international approach — Global AI governance was the focus of a high-levelforumat the IGF 2024 in Riyadhthat brought together leaders from gover…
S100
Open Forum #33 Building an International AI Cooperation Ecosystem — Participant: ≫ Distinguished guests, dear friends, it is a great honor to speak to you today on a topic that is reshapin…
S101
Debating Technology / Davos 2025 — Yann LeCun: Well, I think the answer to this is diversity. So, again, if you have two or three AI systems that all com…
S102
Keynote ‘I’ to the Power of AI An 8-Year-Old on Aspiring India Impacting the World — 8 year old prodigy: Sharing is learning with the rest of the world. One, an AI that is independent. From large global A…
S103
Agenda item 6: other matters — Chair: Thank you very much, France, for your statement. Well, thank you also for the confidence-building measure of sp…
S104
https://dig.watch/event/india-ai-impact-summit-2026/democratizing-ai-building-trustworthy-systems-for-everyone — I think open source is going to be in my mind a critical aspect of it. You’ll have to see how far open source movement t…
S105
https://dig.watch/event/india-ai-impact-summit-2026/from-innovation-to-impact_-bringing-ai-to-the-public — So you are saying that when you make a financial decision, when financial industry or system makes a decision, there may…
S106
DPI High-Level Session — Dr. Yolanda Martinez:heat for those at ITU, and I would like to welcome you, WSIS multistakeholders, DPI ecosystem, and …
S107
Planetary Limits of AI: Governance for Just Digitalisation? | IGF 2023 Open Forum #37 — Martin Wimmer:Thank you. Yesterday morning, I went to Ryoen-Chi. This World Heritage Site in Kyoto and yours is one of t…
S108
GermanAsian AI Partnerships Driving Talent Innovation the Future — Ms. Kofler, please come up. There’s no signs. You can choose in the middle. Next panelist, I would really warmly welcome…
S109
Launch of the eTrade Readiness Assessment of Mauritania (UNCTAD) — It is supported by the financial contribution from GIZ on behalf of the German Federal Ministry for Economic Cooperation…
S110
Digital Trade for Development — In summary, the future of trade is digital, with services, green practices, and inclusivity driving its growth. The expa…
S111
IndoGerman AI Collaboration Driving Economic Development and Soc — Building confidence and security in the use of ICTs | Data governance | Artificial intelligence India’s demographic div…
S112
https://dig.watch/event/india-ai-impact-summit-2026/indogerman-ai-collaboration-driving-economic-development-and-soc — And circular economy. that government, academia, and industry work hand -in -hand. By promoting research and development…
S113
AI for Bharat’s Health_ Addressing a Billion Clinical Realities — “It can deal with multilinguality and voice.”[51]. “There’s firstly a lot of opportunity to bridge some of these inequit…
S114
tABle of Contents — rs, including improved health care, better education, access to a greater number of economic opportunities and greater c…
S115
WS #144 Bridging the Digital Divide Language Inclusion As a Pillar — Manal Ismail: Thank you, Ram, and from a government perspective, of course, truly multilingual internet is crucial for d…
S116
Digital Policy Perspectives — The strategy advocates for democracy, rights-respecting policies, and inclusivity across the digital landscape. The stra…
S117
Ministerial Roundtable — Rashad Nabiyev: We can – thank you, thank you. So here we – According to the alphabetical order, so we start with Azerba…
S118
Secure Finance Risk-Based AI Policy for the Banking Sector — Ajay Kumar Chaudhary opened by highlighting India’s opportunity to lead in AI development while managing associated risk…
Speakers Analysis
Detailed breakdown of each speaker’s arguments and positions
Ariane Ahildur
3 arguments · 126 words per minute · 562 words · 266 seconds
Argument 1
Inclusive Voice as Public Good
EXPLANATION
Voice technology can serve as a natural interface for millions of people who have limited literacy or lack access to conventional digital devices. When voice AI operates in local languages and dialects it unlocks public services, healthcare, education and economic participation, whereas failure to do so risks deepening exclusion.
EVIDENCE
She highlighted that voice is the most natural and powerful interface for those with limited literacy or device access, and that local-language voice AI becomes a gateway to essential services, while its absence can reinforce exclusion [34-38]. She also stressed that responsible, inclusive voice AI is a shared vision beyond a mere technical issue [39-41].
EXTERNAL EVIDENCE (KNOWLEDGE BASE)
Voice technology is highlighted as a gateway to digital inclusion for low-literacy users and as a public good in the Bhashini discussion and IGF standards dialogue, and the Hamburg Declaration reinforces its role for sustainable development [S2][S15][S17][S20].
MAJOR DISCUSSION POINT
Inclusive Voice as Public Good
AGREED WITH
Amitabh Nag, Harleen Kaur, Nihar Desai
Argument 2
Responsible AI Aligned with Hamburg Declaration
EXPLANATION
The report aligns its responsible AI principles with the Hamburg Declaration, which calls for AI that serves people and the planet, strengthens inclusion and supports the Sustainable Development Goals. This positions the Indo‑German partnership as a model of cooperation rather than competition.
EVIDENCE
She referenced the Hamburg Declaration on Responsible AI for Sustainable Development Goals, noting its endorsement by over 50 stakeholders and its emphasis that AI should serve people, the planet, inclusion and sustainable development [49-52].
EXTERNAL EVIDENCE (KNOWLEDGE BASE)
The Hamburg Declaration explicitly frames responsible AI for the SDGs and is cited as the reference point for aligning AI principles in the session [S17][S18][S19].
MAJOR DISCUSSION POINT
Responsible AI Aligned with Hamburg Declaration
Argument 3
Open‑source voice models for nine Indian languages empower diverse stakeholders
EXPLANATION
Ahildur highlights that the Fair Forward initiative has released open voice technologies covering nine Indian languages, which can be freely used by NGOs, state agencies, and companies to build inclusive applications.
EVIDENCE
She states that Fair Forward created open voice technologies for nine Indian languages that can now be used by NGOs, state agencies, and companies [43-45].
EXTERNAL EVIDENCE (KNOWLEDGE BASE)
The release of open-source voice models covering nine Indian languages is documented in the Bhashini panel, and India’s public compute infrastructure is presented as an enabling backdrop [S2][S24].
MAJOR DISCUSSION POINT
Open‑source multilingual voice models as public assets
AGREED WITH
Amitabh Nag, Harleen Kaur
Nihar Desai
3 arguments · 131 words per minute · 1767 words · 804 seconds
Argument 1
Data as Digital Public Good
EXPLANATION
Foundational speech datasets should be treated as Digital Public Goods (DPIs/DPGs) so that they are openly available for reuse and innovation. This requires mechanisms that ensure trust, safety and continuous enrichment of the data.
EVIDENCE
He asked whether foundational speech datasets can be treated as DPIs/DPGs and made generally available, emphasizing the need for trust and safety in their creation and use [118-119]. He later summarized that datasets must be “lived-in” and continuously improved through user feedback rather than remaining static [146-148].
EXTERNAL EVIDENCE (KNOWLEDGE BASE)
Foundational speech datasets are advocated as Digital Public Goods, with calls for trust, safety and continuous enrichment appearing in the policy brief and cross-border data-flow discussions; a contrasting view notes some jurisdictions still treat data purely as private assets [S2][S21][S15][S22].
MAJOR DISCUSSION POINT
Data as Digital Public Good
AGREED WITH
Ariane Ahildur, Amitabh Nag, Harleen Kaur
Argument 2
Flywheel Model for Ongoing Dataset Enrichment
EXPLANATION
A sustainable ecosystem should create a virtuous flywheel where data collection, model improvement and user feedback continuously reinforce each other. This loop keeps datasets fresh and models increasingly accurate.
EVIDENCE
He posed the question about establishing a flywheel for data goods, asking how ongoing creation and facilitation can be achieved while ensuring trust and safety [117-120]. His later remarks about data needing to be “lived-in” and built upon by users echo this flywheel concept [146-148].
EXTERNAL EVIDENCE (KNOWLEDGE BASE)
The concept of a data-creation flywheel is outlined in the Bhashini session and reinforced by the health-AI summit’s description of a self-reinforcing data loop [S2][S23].
MAJOR DISCUSSION POINT
Flywheel Model for Ongoing Dataset Enrichment
AGREED WITH
Amitabh Nag, Harleen Kaur, Prasanta Ghosh
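A minimal sketch of such a flywheel is below, under stated assumptions: the JSONL corpus path and record schema are hypothetical, not Bhashini’s actual pipeline. A deployed product logs user corrections, and the next fine-tuning cycle consumes them once enough have accumulated.

```python
# Illustrative flywheel sketch: user corrections from a deployed product become
# new training data. File path and record schema are hypothetical.
import json
from pathlib import Path

CORPUS = Path("improvement_corpus.jsonl")  # hypothetical improvement corpus

def log_correction(audio_id, model_output, user_correction):
    """Append a flagged mismatch so the next training cycle can learn from it."""
    record = {"audio_id": audio_id, "hypothesis": model_output,
              "reference": user_correction, "source": "product_feedback"}
    with CORPUS.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")

def load_training_increment(min_records=1000):
    """Return corrections once enough have accumulated to justify a fine-tuning run."""
    if not CORPUS.exists():
        return []
    records = [json.loads(line) for line in CORPUS.open(encoding="utf-8")]
    return records if len(records) >= min_records else []

# In the deployed product, every flagged transcript or summary becomes data:
log_correction("call_0042", "meeting at ten", "meeting at two")
```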
Argument 3
Policy Framework for Sustainable Open‑Source Infrastructure
EXPLANATION
The proposed policy framework rests on four pillars, one of which institutionalises sustainable open‑source infrastructure and standard‑setting. This creates a stable environment for public‑good data and models.
EVIDENCE
He presented the four-pillar policy framework and highlighted the pillar on institutionalising sustainable open-source infrastructure as a core element of the approach [70-73].
EXTERNAL EVIDENCE (KNOWLEDGE BASE)
A four-pillar policy framework that institutionalises sustainable open-source infrastructure is presented in the discussion, while later IGF remarks raise concerns about verification and procurement of open-source code [S2][S15][S28].
MAJOR DISCUSSION POINT
Policy Framework for Sustainable Open‑Source Infrastructure
AGREED WITH
Harleen Kaur, Thomas J. Vallianeth
Harleen Kaur
6 arguments · 143 words per minute · 1036 words · 432 seconds
Argument 1
Policy Pillars for Inclusion
EXPLANATION
The policy report structures its inclusion strategy around four pillars: treating foundational datasets as public goods, institutionalising sustainable open‑source infrastructure, building open and representative models, and strengthening responsible deployment. Together they provide a roadmap for inclusive voice AI.
EVIDENCE
She outlined the four pillars (public-good data, sustainable open-source infrastructure, open representative models, and responsible deployment) and described each pillar’s focus in the presentation [73-78].
EXTERNAL EVIDENCE (KNOWLEDGE BASE)
The four-pillar inclusion strategy (public-good data, sustainable open-source, open models, responsible deployment) is detailed in the Bhashini report and aligns with broader inclusive ICT policy recommendations [S2][S15][S26].
MAJOR DISCUSSION POINT
Policy Pillars for Inclusion
AGREED WITH
Amitabh Nag, Prasanta Ghosh, Nihar Desai
Argument 2
Treat Foundational Datasets as Public Goods
EXPLANATION
Foundational speech datasets should be funded and convened as public‑good resources, especially for languages that are not commercially viable. This ensures that essential linguistic diversity is preserved and made accessible.
EVIDENCE
She explicitly stated that treating foundational datasets as public goods involves government support for funding and convening, particularly for non-commercial languages [79-81].
EXTERNAL EVIDENCE (KNOWLEDGE BASE)
Treating foundational speech datasets as publicly funded, especially for non-commercial languages, is emphasized in the policy brief and echoed in cross-border data-goods discussions [S2][S21].
MAJOR DISCUSSION POINT
Treat Foundational Datasets as Public Goods
AGREED WITH
Ariane Ahildur, Amitabh Nag, Nihar Desai
Argument 3
Embedding RAI Practices in the Toolkit
EXPLANATION
The developer toolkit translates responsible AI (RAI) principles into concrete practices across the entire development lifecycle, covering representation, data quality, and continuous monitoring. This ensures that developers can build inclusive voice systems from the start.
EVIDENCE
She described how the toolkit embeds RAI through best-practice guidance on representation, data quality, evaluation and lifecycle integration, providing concrete steps for developers [90-108].
EXTERNAL EVIDENCE (KNOWLEDGE BASE)
The developer toolkit translates Responsible AI principles into concrete practices, drawing on the Hamburg Declaration’s RAI framework and multistakeholder responsible-AI initiatives [S17][S15][S19].
MAJOR DISCUSSION POINT
Embedding RAI Practices in the Toolkit
Argument 4
Institutionalising Open‑Source Governance and Standards
EXPLANATION
Governments should act as stewards and standard‑setters for open‑source voice ecosystems, establishing governance frameworks, documentation standards and collaborative data‑steward models. This creates trustworthy, interoperable resources for the community.
EVIDENCE
She advocated for governments to be ecosystem conveners, standard setters and to institutionalise sustainable open-source infrastructure, noting the need for standardisation of documents and collaborative stewardship models [71-76] and further emphasizing documentation and standardisation in later slides [84-85].
EXTERNAL EVIDENCE (KNOWLEDGE BASE)
Calls for government stewardship, standard-setting and collaborative data-steward models are supported by IGF calls for inclusive standard-setting, while later remarks highlight verification and procurement challenges for open-source code [S15][S28][S29].
MAJOR DISCUSSION POINT
Institutionalising Open‑Source Governance and Standards
AGREED WITH
Nihar Desai, Thomas J. Vallianeth
Argument 5
Continuous post‑deployment monitoring and standardized documentation
EXPLANATION
Kaur advocates for ongoing monitoring of voice AI systems after deployment, using standardized model cards, data cards, and transcription benchmarks to ensure responsible performance over time.
EVIDENCE
She references robust transcription standards, contextual benchmarks, data cards, model cards, and continuous post-deployment monitoring as part of the toolkit recommendations [103].
EXTERNAL EVIDENCE (KNOWLEDGE BASE)
The recommendation to use model cards, data cards and benchmark suites for ongoing monitoring aligns with IGF discussions on standardized documentation and legal safeguards for open models [S15][S28].
MAJOR DISCUSSION POINT
Post‑deployment monitoring and documentation
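To make the documentation idea concrete, here is a rough model-card sketch with post-deployment monitoring fields. The schema, field names and values are illustrative assumptions, not a prescribed Bhashini format.

```python
# Minimal model-card sketch for post-deployment monitoring.
# Field names and values are illustrative assumptions, not a prescribed schema.
from dataclasses import dataclass, field, asdict
import json

@dataclass
class ModelCard:
    name: str
    version: str
    languages: list            # languages and dialects covered
    training_data_cards: list  # links to the data cards the model was trained on
    intended_use: str
    known_limitations: list = field(default_factory=list)
    benchmark_wer: dict = field(default_factory=dict)   # per-language scores at release
    monitoring_wer: dict = field(default_factory=dict)  # refreshed after deployment

card = ModelCard(
    name="asr-telugu-demo", version="0.3.1",
    languages=["te", "te-Telangana"],
    training_data_cards=["datacard://telugu-field-2025"],
    intended_use="General dictation; not validated for medical transcription.",
    known_limitations=["Degrades on heavy code-mixing with English."],
    benchmark_wer={"te": 0.18},
)
card.monitoring_wer["te"] = 0.21  # post-deployment drift worth investigating
print(json.dumps(asdict(card), indent=2))
```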
Argument 6
Synthetic and layered data strategies to enhance representation
EXPLANATION
Kaur proposes using synthetic data generation and a hybrid, layered data collection approach to broaden linguistic representation while avoiding reliance on a single data source.
EVIDENCE
She describes having a diversity wish list, using synthetic data, and employing a hybrid layered structure to make models more diverse, emphasizing not to collect data from only one source [97-100].
MAJOR DISCUSSION POINT
Synthetic and layered data for inclusive AI
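A toy sketch of the layered strategy, under stated assumptions (the dialect names, source layers and quota are all invented): pool utterances from several layers, then flag dialects that fall short of a representation target so the gap can be filled with synthetic generation or targeted collection.

```python
# Toy sketch: layered data sources with per-dialect representation targets.
# Dialect names, source layers and quotas are invented for illustration.
from collections import Counter

# Each utterance is (dialect, source_layer).
pool = [
    ("telangana", "field"), ("telangana", "product_logs"),
    ("rayalaseema", "field"), ("rayalaseema", "synthetic_tts"),
    ("coastal", "field"),
]
TARGET_PER_DIALECT = 2  # a "diversity wish list" expressed as a minimum quota

counts = Counter(dialect for dialect, _ in pool)
gaps = {d: TARGET_PER_DIALECT - n for d, n in counts.items() if n < TARGET_PER_DIALECT}

# Fill gaps from the cheapest remaining layer first (synthetic), then field work.
for dialect, missing in gaps.items():
    print(f"{dialect}: short by {missing}; schedule synthetic generation or targeted collection")
```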
Amitabh Nag
6 arguments · 162 words per minute · 1513 words · 558 seconds
Argument 1
Inclusion by Design and Continuous Upgrade
EXPLANATION
AI systems must be built with inclusion and diversity baked into their design, recognizing that language, culture and individual differences evolve rapidly. Consequently, models need continual updates rather than static, long‑lived deployments.
EVIDENCE
He explained that diversity of persons, languages and cultures makes inclusion a core design element, and that AI systems lack warranties and must be continuously upgraded, unlike static machines [9-16] and [5-6].
EXTERNAL EVIDENCE (KNOWLEDGE BASE)
Design-first inclusion and the need for continual model updates are echoed in AI4All’s design pillar and IGF calls for inclusive standard-setting [S26][S15].
MAJOR DISCUSSION POINT
Inclusion by Design and Continuous Upgrade
AGREED WITH
Ariane Ahildur, Harleen Kaur, Nihar Desai
Argument 2
Continuous, Feedback‑Driven Data Creation
EXPLANATION
Data creation should be an ongoing process that combines brute‑force collection with feedback loops from deployed products, generating primary and improvement corpora that are fed back into models. This creates a self‑reinforcing data ecosystem.
EVIDENCE
He described two approaches: traditional field collection to capture diversity and leveraging product usage to automatically generate parallel corpora, followed by annotation and feedback pipelines that continuously enrich the model [121-144].
EXTERNAL EVIDENCE (KNOWLEDGE BASE)
Feedback loops that turn product usage into new training data are described in the flywheel model and reinforced by the health-AI summit’s example of a self-reinforcing data ecosystem [S23][S2].
MAJOR DISCUSSION POINT
Continuous, Feedback‑Driven Data Creation
AGREED WITH
Harleen Kaur, Prasanta Ghosh, Nihar Desai
DISAGREED WITH
Prasanta Ghosh
Argument 3
Audience Perception as Evaluation Criterion
EXPLANATION
The ultimate measure of a voice model’s success is whether the audience understands and accepts it; perception, tone and pronunciation matter more than abstract metrics. Different stakeholders may rank models differently based on their own expectations.
EVIDENCE
He argued that evaluation is decided by the audience’s ability to understand the output, noting varied preferences across ministries and contexts, and emphasizing perception over absolute rankings [256-270].
MAJOR DISCUSSION POINT
Audience Perception as Evaluation Criterion
AGREED WITH
Prasanta Ghosh, Thomas J. Vallianeth
DISAGREED WITH
Prasanta Ghosh, Thomas J. Vallianeth
Argument 4
Leveraging Product Feedback Loops for Model Improvement
EXPLANATION
Enterprise users should be able to flag mismatches between AI‑generated summaries and manual expectations, feeding these corrections back into the model to drive continuous improvement. Such conscious feedback programs turn every interaction into a data source.
EVIDENCE
He gave the example of a user noticing a summary discrepancy, recording it, and feeding it back into the model, highlighting the need for systematic feedback mechanisms in enterprise applications [139-144].
MAJOR DISCUSSION POINT
Leveraging Product Feedback Loops for Model Improvement
Argument 5
Scaling inclusive voice AI solutions globally
EXPLANATION
Nag argues that the challenges and solutions discussed should extend beyond India to regions such as Southeast Asia and Africa, requiring scalable policies and toolkits that can address diverse linguistic and cultural contexts worldwide.
EVIDENCE
He mentions the importance of scaling solutions to Southeast Asia, Africa, and other places, and highlights the need for replicable policies, standards, and toolkits that can be adapted across regions [1-3].
EXTERNAL EVIDENCE (KNOWLEDGE BASE)
The push to extend Indian-origin voice AI solutions to Southeast Asia and Africa is supported by India’s public compute infrastructure rollout and multistakeholder partnership narratives [S24][S15][S19].
MAJOR DISCUSSION POINT
Global scaling of inclusive voice AI
AGREED WITH
Ariane Ahildur, Harleen Kaur
Argument 6
Replicable policy and toolkit frameworks as enablers
EXPLANATION
Nag stresses that the policies, standards, and toolkits developed for voice AI should be designed for replication, enabling other organizations and countries to adopt proven approaches without reinventing the wheel.
EVIDENCE
He states that there are policies, standards, and toolkits which have been developed and can actually be replicated [3].
EXTERNAL EVIDENCE (KNOWLEDGE BASE)
The notion that policies, standards and toolkits should be designed for replication is highlighted in multistakeholder partnership reports and the Bhashini policy framework [S19][S2].
MAJOR DISCUSSION POINT
Replicable frameworks for voice AI deployment
AGREED WITH
Ariane Ahildur, Harleen Kaur
DISAGREED WITH
Thomas J. Vallianeth
Prasanta Ghosh
2 arguments · 160 words per minute · 1184 words · 443 seconds
Argument 1
Need for Multi‑Layered, Context‑Sensitive Evaluation
EXPLANATION
Because human annotators disagree on transcriptions, evaluation must move beyond simple word‑error‑rate metrics to multi‑layered, context‑aware approaches that consider variability, downstream task tolerance and subjective judgments. A combination of objective and human‑centric assessments is required.
EVIDENCE
He recounted instances where two annotators from nearby villages produced different transcriptions, arguing that evaluation should accommodate such variability through multi-layered metrics, alternative outputs and downstream task-specific assessments [228-244].
EXTERNAL EVIDENCE (KNOWLEDGE BASE)
Calls for multi-layered, context-aware evaluation metrics align with IGF discussions on inclusive standards and the need for robust evaluation frameworks for open models [S15][S28].
MAJOR DISCUSSION POINT
Need for Multi‑Layered, Context‑Sensitive Evaluation
AGREED WITH
Amitabh Nag, Thomas J. Vallianeth
DISAGREED WITH
Amitabh Nag, Thomas J. Vallianeth
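One way to operationalize this tolerance for annotator disagreement is multi-reference scoring: evaluate a hypothesis against every available human transcription and keep the closest match, rather than trusting a single reference. The sketch below assumes the third-party jiwer package for word error rate; the romanized sentences are invented examples.

```python
# Multi-reference scoring sketch: accommodate variability between human annotators
# by scoring against every available reference and keeping the best match.
# Requires the third-party `jiwer` package; example sentences are invented.
import jiwer

def multi_reference_wer(references, hypothesis):
    """WER against the closest of several equally valid human transcriptions."""
    return min(jiwer.wer(ref, hypothesis) for ref in references)

# Two annotators from nearby villages transcribe the same audio differently.
refs = ["naaku telusu andi", "naku thelusu andi"]
hyp = "naku telusu andi"

print(multi_reference_wer(refs, hyp))     # best-match score
print([jiwer.wer(r, hyp) for r in refs])  # vs. single-reference scores
```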
Argument 2
Intrinsic linguistic component modeling to reduce data collection costs
EXPLANATION
Ghosh suggests that instead of brute‑force data gathering across all dialects, modeling can start from intrinsic linguistic bases (e.g., Indo‑Aryan vs Dravidian families) and then expand to dialectal variations, lowering cost and time.
EVIDENCE
He explains the concept of intrinsic basis components, using the example of Indian language families and balancing data collection with modeling to achieve coverage with reduced resources [160-168].
MAJOR DISCUSSION POINT
Efficient modeling via intrinsic linguistic components
AGREED WITH
Amitabh Nag, Harleen Kaur, Nihar Desai
DISAGREED WITH
Amitabh Nag
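A very rough illustration of the intrinsic-components idea: estimate how much of the phone inventory the dialects share, train the base model on that shared core, and schedule targeted collection only for the residue. The inventories below are fabricated placeholders, not real Telugu phonology.

```python
# Fabricated sketch: estimate acoustic overlap between dialects to target data collection.
# The phone inventories below are placeholders, not real Telugu phonology.

dialect_phones = {
    "coastal":     {"a", "i", "u", "k", "g", "t", "d", "n", "l"},
    "telangana":   {"a", "i", "u", "k", "g", "t", "d", "n", "r"},
    "rayalaseema": {"a", "i", "u", "k", "g", "t", "d", "m", "l"},
}

shared = set.intersection(*dialect_phones.values())
print(f"shared base inventory ({len(shared)} phones): {sorted(shared)}")

# Only the dialect-specific residue needs targeted recording sessions.
for dialect, phones in dialect_phones.items():
    residue = phones - shared
    print(f"{dialect}: collect targeted data covering {sorted(residue) or 'nothing extra'}")
```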
Kritika K.R.
4 arguments · 138 words per minute · 432 words · 186 seconds
Argument 1
Application‑Level Evaluation and Domain Adaptation
EXPLANATION
Different industry domains require tailored evaluation and model adaptation; voice AI must be fine‑tuned to specific jargon, workflows and compliance needs. Leveraging LLMs and custom data sets helps align models with sector‑specific requirements.
EVIDENCE
She described how industry applications need domain-specific data, custom tuning and compliance considerations, and how LLMs can be adapted to industry jargon and workflows, enabling on-premise deployment for security [190-199] and further detailed the process of sourcing open-source data, fine-tuning and compliance for specific use cases [245-254].
EXTERNAL EVIDENCE (KNOWLEDGE BASE)
Domain-specific tuning and the use of LLMs for industry jargon are discussed in AI4All’s application pillar and IGF notes on sector-specific AI deployment [S26][S15].
MAJOR DISCUSSION POINT
Application‑Level Evaluation and Domain Adaptation
Argument 2
Scalable, Edge‑Ready Infrastructure for Enterprise Use
EXPLANATION
For widespread adoption, voice models must be lightweight, scalable and capable of running on edge devices, ensuring low latency and data‑privacy for enterprise deployments across sectors such as healthcare, manufacturing and logistics.
EVIDENCE
She highlighted the need for scalable, sustainable infrastructure, optimized models and edge deployments to enable real-world adoption across multiple industries [196-199].
EXTERNAL EVIDENCE (KNOWLEDGE BASE)
The need for lightweight, edge-deployable voice models is reinforced by India’s public compute capacity plan and AI4All’s emphasis on scalable, sustainable AI infrastructure [S24][S26].
MAJOR DISCUSSION POINT
Scalable, Edge‑Ready Infrastructure for Enterprise Use
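As an assumption-heavy sketch of making a model edge-ready, the snippet below applies PyTorch dynamic quantization to a toy network and compares serialized sizes; a production pipeline would instead quantize a real acoustic model and re-validate accuracy on-device afterwards.

```python
# Sketch: dynamic quantization to shrink a model for edge deployment.
# The tiny network is a stand-in for a real acoustic model.
import os
import tempfile
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(80, 256), nn.ReLU(), nn.Linear(256, 64))  # toy stand-in

# Convert Linear weights to int8; activations are quantized dynamically at runtime.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

def size_mb(m):
    """Serialize the state dict to a temp file and report its size in MB."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        path = f.name
    torch.save(m.state_dict(), path)
    mb = os.path.getsize(path) / 1e6
    os.remove(path)
    return mb

print(f"fp32: {size_mb(model):.2f} MB -> int8: {size_mb(quantized):.2f} MB")
```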
Argument 3
On‑premise deployment for security and compliance in enterprise AI
EXPLANATION
Kritika emphasizes that deploying voice AI models on‑premises allows enterprises to maintain data security, meet compliance requirements, and tailor models to sector‑specific needs.
EVIDENCE
She notes that open-source models enable on-prem deployment, which supports security and compliance for different core industry applications [254-259].
EXTERNAL EVIDENCE (KNOWLEDGE BASE)
On-premise deployment for data security and compliance is highlighted alongside concerns about procurement law compliance for open-source software in IGF sessions [S28].
MAJOR DISCUSSION POINT
On‑premise deployment for secure enterprise AI
Argument 4
Combining voice AI with large language models for domain‑specific adaptation
EXPLANATION
She argues that integrating voice AI with LLMs facilitates customization to industry jargon and workflows, improving performance in specialized sectors.
EVIDENCE
She describes how LLMs can be adapted to industry jargon and core workflows, enabling models to be fine-tuned for specific use cases [250-254].
MAJOR DISCUSSION POINT
Voice AI + LLM integration for domain adaptation
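A lightweight stand-in for the jargon adaptation described here: post-correct ASR output against a domain lexicon with fuzzy matching, before (or alongside) any LLM rewriting. The lexicon entries and similarity threshold are invented for illustration.

```python
# Sketch: snap ASR tokens to a domain lexicon via fuzzy matching.
# Lexicon entries and the 0.8 threshold are illustrative assumptions.
import difflib

DOMAIN_LEXICON = ["angiography", "stent", "troponin", "echocardiogram"]

def correct_jargon(transcript, cutoff=0.8):
    """Replace near-miss tokens with the closest domain term, if any."""
    fixed = []
    for token in transcript.split():
        match = difflib.get_close_matches(token, DOMAIN_LEXICON, n=1, cutoff=cutoff)
        fixed.append(match[0] if match else token)
    return " ".join(fixed)

print(correct_jargon("patient scheduled for angeography and stant placement"))
# -> "patient scheduled for angiography and stent placement"
```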
Thomas J. Vallianeth
5 arguments · 171 words per minute · 1138 words · 397 seconds
Argument 1
Legal Foundations for Data Ownership and Privacy
EXPLANATION
Voice datasets sit at the intersection of privacy and copyright law; therefore, projects must assess provenance, secure appropriate licences and apply privacy‑enhancing technologies from the outset. Robust documentation is essential to maintain a trusted downstream ecosystem.
EVIDENCE
He explained that publicly available data may still be copyrighted, requiring careful provenance checks and licences, and that privacy-enhancing techniques can be used to avoid personal data capture; he also stressed the need for strong documentation to enable safe downstream use [208-214][215-218].
EXTERNAL EVIDENCE (KNOWLEDGE BASE)
The importance of provenance, licensing and privacy-enhancing technologies for voice datasets is discussed in IGF standards and legal-risk analyses, with additional critique on data-as-public-good policies in certain jurisdictions [S15][S28][S22].
MAJOR DISCUSSION POINT
Legal Foundations for Data Ownership and Privacy
DISAGREED WITH
Amitabh Nag
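As one concrete, deliberately simplistic example of such privacy-enhancing techniques, the sketch below redacts obvious personally identifiable patterns from transcripts before corpus ingestion; real pipelines rely on trained PII detectors rather than a handful of regular expressions.

```python
# Deliberately simplistic redaction sketch: strip obvious PII from transcripts
# before corpus ingestion. Real pipelines use trained PII detectors, not regexes.
import re

PATTERNS = [
    (re.compile(r"\b[6-9]\d{9}\b"), "<PHONE>"),               # Indian mobile numbers
    (re.compile(r"\b\d{4}\s?\d{4}\s?\d{4}\b"), "<ID>"),       # Aadhaar-like 12-digit IDs
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "<EMAIL>"),  # email addresses
]

def redact(text):
    for pattern, token in PATTERNS:
        text = pattern.sub(token, text)
    return text

print(redact("Call me on 9876543210 or mail raj@example.com"))
# -> "Call me on <PHONE> or mail <EMAIL>"
```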
Argument 2
Subjectivity, Evidence, and Trust in Legal Evaluation
EXPLANATION
Legal assessment of AI outputs involves subjective judgments; establishing trust requires clear documentation, privacy safeguards and demonstrable safeguards from the beginning. Courts will need evidentiary standards that reflect these safeguards.
EVIDENCE
He discussed how subjectivity in evaluating harmful content can be mitigated by embedding safeguards early, and how documentation and demonstrated high-level safeguards reduce evidentiary uncertainty for courts [285-298].
MAJOR DISCUSSION POINT
Subjectivity, Evidence, and Trust in Legal Evaluation
Argument 3
Documentation, Licensing, and Privacy‑Enhancing Measures
EXPLANATION
Effective legal compliance hinges on clear documentation of data provenance, appropriate open‑source licences and the use of privacy‑enhancing technologies. These measures protect both data subjects and downstream users.
EVIDENCE
He reiterated the importance of documenting copyright provenance, applying suitable licences and employing privacy-enhancing techniques to ensure that downstream usage remains lawful and trustworthy [208-218].
EXTERNAL EVIDENCE (KNOWLEDGE BASE)
Robust documentation, appropriate open-source licences and privacy-enhancing techniques are emphasized in the open-source legal safeguards discussion [S28].
MAJOR DISCUSSION POINT
Documentation, Licensing, and Privacy‑Enhancing Measures
AGREED WITH
Harleen Kaur, Nihar Desai
Argument 4
End‑Use Safeguards and Licensing Choices for Open Models
EXPLANATION
Licensing and safeguards must be chosen based on the intended downstream application; a model for hate‑speech detection requires different controls than one for speech‑to‑speech translation. Tailoring licences and safeguards to use‑case reduces risk.
EVIDENCE
He gave the example of differing safeguards for hate-speech detection versus speech-to-speech translation, arguing that licensing frameworks and documentation should reflect the specific end-use and downstream user expectations [311-313].
EXTERNAL EVIDENCE (KNOWLEDGE BASE)
Tailoring licences and safeguards to specific downstream applications is recommended in the IGF open-source licensing debate [S28].
MAJOR DISCUSSION POINT
End‑Use Safeguards and Licensing Choices for Open Models
Argument 5
National‑level evaluation framework for Indian language AI
EXPLANATION
Vallianeth calls for the creation of a standardized, country‑wide evaluation framework and regular benchmarking contests to assess Indian language speech models, fostering both competition and collaboration.
EVIDENCE
He mentions the need for a national level framework for evaluation, referencing NIST-style protocols and annual assessments to track progress across languages and dialects [313-314].
MAJOR DISCUSSION POINT
National evaluation framework for Indian language AI
AGREED WITH
Prasanta Ghosh, Amitabh Nag
DISAGREED WITH
Amitabh Nag, Prasanta Ghosh
Moderator
1 argument · 68 words per minute · 267 words · 232 seconds
Argument 1
Multi‑stakeholder convening as catalyst for voice AI ecosystem
EXPLANATION
The moderator stresses that bringing together representatives from government, industry, academia, and civil society is essential to foster collaboration, share expertise, and accelerate the development and deployment of inclusive voice technologies.
EVIDENCE
The moderator invites participants from GIZ, Tri-Legal, Art Park, NASSCOM, and Digital Futures Lab to the stage and later assembles a panel that includes the CEO of DIBD, an academic professor, an AI product researcher, and legal counsel, demonstrating a deliberate effort to create a multi-sector dialogue [57-61][112-113].
EXTERNAL EVIDENCE (KNOWLEDGE BASE)
The necessity of multi-stakeholder dialogue for inclusive voice AI is underscored in IGF calls for broader participation in standard-setting and in multistakeholder partnership reports [S15][S19][S20].
MAJOR DISCUSSION POINT
Multi‑stakeholder convening for ecosystem development
Agreements
Agreement Points
Voice technology and speech datasets should be treated as public goods to promote digital inclusion
Speakers: Ariane Ahildur, Amitabh Nag, Harleen Kaur, Nihar Desai
Inclusive Voice as Public Good · Inclusion by Design and Continuous Upgrade · Treat Foundational Datasets as Public Goods · Data as Digital Public Good
All four speakers stress that voice AI and foundational speech datasets must be openly available and designed for inclusion, especially for low-literacy and underserved populations, so that they become public assets that can be replicated and scaled. [34-38][9-16][79-81][118-119][146-148]
POLICY CONTEXT (KNOWLEDGE BASE)
This aligns with the Digital Public Goods framework that advocates treating language and AI technologies as shared infrastructure for inclusion [S52] and echoes calls to treat language technology as public infrastructure for preserving endangered languages [S54].
Datasets need continuous, feedback‑driven enrichment (flywheel model) rather than static collection
Speakers: Amitabh Nag, Harleen Kaur, Prasanta Ghosh, Nihar Desai
Continuous, Feedback‑Driven Data Creation · Policy Pillars for Inclusion · Intrinsic linguistic component modeling to reduce data collection costs · Flywheel Model for Ongoing Dataset Enrichment
The speakers agree that data creation must be an ongoing process that combines field collection with automatic generation from product usage and smart linguistic modeling, creating a virtuous flywheel that continuously improves models. [121-144][89-100][160-168][117-120][146-148]
Current evaluation metrics are insufficient; a multi‑layered, context‑sensitive evaluation framework is needed
Speakers: Prasanta Ghosh, Amitabh Nag, Thomas J. Vallianeth
Need for Multi‑Layered, Context‑Sensitive Evaluation · Audience Perception as Evaluation Criterion · National‑level evaluation framework for Indian language AI
All three highlight that simple word-error-rate scores do not capture the variability of human transcription or downstream task requirements, calling for richer, multi-dimensional evaluation methods and a national benchmarking system. [228-244][256-270][313-314]
POLICY CONTEXT (KNOWLEDGE BASE)
Reflects concerns raised about the need for context-sensitive evaluation versus standardized benchmarks, as discussed in the AI governance debate on evaluation frameworks [S65] and the EU AI Act’s requirement for detailed evaluation strategies [S63].
Sustainable open‑source infrastructure and governance must be institutionalised with clear documentation and licensing
Speakers: Harleen Kaur, Nihar Desai, Thomas J. Vallianeth
Institutionalising Open‑Source Governance and Standards · Policy Framework for Sustainable Open‑Source Infrastructure · Documentation, Licensing, and Privacy‑Enhancing Measures
The panel concurs that governments should act as stewards, setting standards, maintaining robust documentation, and applying appropriate licences and privacy-enhancing techniques to build trustworthy open-source voice ecosystems. [71-76][84-85][70-73][208-218]
POLICY CONTEXT (KNOWLEDGE BASE)
Echoes the challenges identified regarding long-term hosting, licensing costs and governance of open-source assets [S44] and the call to recognize the open-source community as a critical stakeholder in policy design [S46].
Policies, toolkits and open models should be designed for replication and global scaling
Speakers: Amitabh Nag, Ariane Ahildur, Harleen Kaur
Scaling inclusive voice AI solutions globally · Open‑source voice models for nine Indian languages empower diverse stakeholders · Replicable policy and toolkit frameworks as enablers
These speakers emphasize that the frameworks, toolkits and open voice models developed in India are intended to be replicated and adapted for other regions such as Southeast Asia and Africa, enabling broader impact. [1-3][19][43-45][3]
POLICY CONTEXT (KNOWLEDGE BASE)
Consistent with the multistakeholder ecosystem approach for scaling digital infrastructure highlighted in IGF discussions on inclusive participation [S48] and the emphasis on capacity-building for scalable solutions [S41].
Similar Viewpoints
Both stress that licensing, safeguards and deployment choices must be aligned with the specific downstream application—e.g., hate‑speech detection versus speech‑to‑speech translation—to ensure security and compliance. [254-259][311-313]
Speakers: Kritika K.R., Thomas J. Vallianeth
On‑premise deployment for security and compliance in enterprise AI · End‑Use Safeguards and Licensing Choices for Open Models
Both frame voice technology and foundational speech data as public goods that require open access, trust, and safety mechanisms. [34-38][118-119]
Speakers: Ariane Ahildur, Nihar Desai
Inclusive Voice as Public Good · Data as Digital Public Good
Both recognize that domain‑specific needs (dialects, industry jargon) demand tailored data collection and modeling strategies to balance cost and performance. [160-168][245-254]
Speakers: Prasanta Ghosh, Kritika K.R.
Intrinsic linguistic component modeling to reduce data collection costs · Application‑Level Evaluation and Domain Adaptation
Unexpected Consensus
Importance of documentation and early safeguards to build trust across legal and technical domains
Speakers: Thomas J. Vallianeth, Amitabh Nag
Legal Foundations for Data Ownership and Privacy · Audience Perception as Evaluation Criterion
While Thomas discusses documentation to satisfy legal evidentiary standards, Nag emphasizes audience understanding as the ultimate metric; both nonetheless converge on the idea that clear, upfront documentation and safeguards are essential for trustworthiness of AI systems. [208-218][256-270]
POLICY CONTEXT (KNOWLEDGE BASE)
Matches the emphasis on building legal certainty and trust in data flows through clear documentation [S43] and the HUDERIA methodology that bridges legal and technical safeguards [S61].
Recognition that evaluation challenges are a shared responsibility rather than belonging to a single stakeholder group
Speakers: Prasanta Ghosh, Harleen Kaur, Amitabh Nag
Need for Multi‑Layered, Context‑Sensitive Evaluation · Embedding RAI Practices in the Toolkit · Audience Perception as Evaluation Criterion
Academic (Ghosh), policy (Kaur) and industry (Nag) participants all agree that evaluation must be multi-dimensional, involve responsible AI practices, and consider end-user perception, indicating a cross-sector consensus on rethinking evaluation. [228-244][90-108][256-270]
POLICY CONTEXT (KNOWLEDGE BASE)
Aligns with the call for multi-stakeholder collaboration in AI governance, noting the open-source community’s role and the need to avoid regulatory overreach [S46].
Overall Assessment

There is strong consensus that inclusive voice AI must be treated as a public good, that data and models require continuous, feedback‑driven enrichment, that open‑source governance and robust documentation are essential, and that evaluation metrics need to evolve beyond simple error rates to multi‑layered, context‑aware frameworks. Participants also agree on the need for scalable, replicable policies and toolkits to extend impact globally.

High consensus across government, academia, industry and legal stakeholders, indicating a solid foundation for coordinated policy action, standard‑setting and investment in sustainable voice AI ecosystems.

Differences
Different Viewpoints
Evaluation methodology for voice AI systems
Speakers: Amitabh Nag, Prasanta Ghosh, Thomas J. Vallianeth
Audience Perception as Evaluation Criterion · Need for Multi‑Layered, Context‑Sensitive Evaluation · National‑level evaluation framework for Indian language AI
Nag argues that the ultimate measure of a model is whether the audience understands and accepts it, rejecting absolute rankings and emphasizing perception [256-270]. Prasanta counters that human annotator variability makes word-error-rate insufficient and calls for multi-layered, context-aware metrics that combine objective and subjective assessments [228-244]. Thomas adds that a standardized, country-wide evaluation framework with regular benchmarking (similar to NIST) is needed to provide comparable, objective scores across languages and dialects [313-314].
POLICY CONTEXT (KNOWLEDGE BASE)
The EU AI Act specifies detailed evaluation criteria and methodologies for AI systems, providing a policy backdrop for debates on voice AI evaluation [S63].
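One rough way to see how these positions could coexist is a layered score that blends an objective error rate with subjective audience ratings, weighted by the tolerance of the deployment context (a courtroom transcript tolerates far less deviation than meeting notes). The sketch below is purely illustrative: the context profiles, weights and 1-5 rating scale are invented for the example, not part of any proposed national framework.

```python
# Hypothetical multi-layered evaluation: blend an objective word error
# rate with subjective audience ratings, weighted per context.

def word_error_rate(reference, hypothesis):
    """Standard WER via Levenshtein distance over word tokens."""
    r, h = reference.split(), hypothesis.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(r)][len(h)] / max(len(r), 1)

# Invented context profiles: how much objective accuracy vs. perceived
# understandability matters for a given downstream use.
CONTEXTS = {
    "courtroom":     {"w_objective": 0.8, "w_subjective": 0.2},
    "meeting_notes": {"w_objective": 0.4, "w_subjective": 0.6},
}

def layered_score(reference, hypothesis, audience_ratings, context):
    """Return a 0..1 quality score; higher is better."""
    w = CONTEXTS[context]
    objective = 1.0 - min(word_error_rate(reference, hypothesis), 1.0)
    subjective = sum(audience_ratings) / (5 * len(audience_ratings))  # 1-5 scale
    return w["w_objective"] * objective + w["w_subjective"] * subjective

print(layered_score("the hearing is adjourned till monday",
                    "the hearing is adjourned til monday",
                    audience_ratings=[5, 4, 5], context="courtroom"))
```

The same hypothesis scores differently per context, which is Nag's point; the objective error term keeps outputs comparable, which is Thomas's; and the layering itself reflects Ghosh's proposal.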
Approach to data collection and corpus creation
Speakers: Amitabh Nag, Prasanta Ghosh
Continuous, Feedback‑Driven Data Creation · Intrinsic linguistic component modeling to reduce data collection costs
Nag proposes a two-pronged strategy: continued brute-force field collection to capture diversity, plus leveraging product usage to generate primary and improvement corpora that feed back into models [124-131]. Prasanta suggests a more efficient route that starts from intrinsic linguistic bases (e.g., Indo-Aryan vs Dravidian families) and then expands to dialects, lowering the need for exhaustive data gathering [160-168].
POLICY CONTEXT (KNOWLEDGE BASE)
Highlights the data lifecycle stages (creation, collection, organization, use) outlined in AI data management primers [S56] and the open-data policy that governs data collection practices while balancing privacy concerns [S58].
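Ghosh's strategy can be read as a coverage problem: characterise each dialect by its acoustic and phonetic traits, record the shared base once, then pick the smallest set of targeted collections that covers what remains. The sketch below is a hypothetical greedy set-cover illustration; the dialect names and feature labels are invented placeholders, not the Telugu project's actual inventory.

```python
# Hypothetical greedy set cover over phonetic features: choose the
# fewest targeted collections that cover every dialect-specific trait.

DIALECT_FEATURES = {
    "dialect_a": {"base_vowels", "retroflex_set", "aspiration_t1"},
    "dialect_b": {"base_vowels", "retroflex_set", "vowel_shift_r1"},
    "dialect_c": {"base_vowels", "retroflex_set", "gemination_c1"},
    "dialect_d": {"base_vowels", "aspiration_t1", "vowel_shift_r1"},
}

def plan_collection(dialects):
    """Greedy set cover: pick dialects until all features are covered."""
    uncovered = set().union(*dialects.values())
    plan = []
    while uncovered:
        # Pick the dialect whose recordings cover the most remaining features.
        best = max(dialects, key=lambda d: len(dialects[d] & uncovered))
        gained = dialects[best] & uncovered
        if not gained:
            break
        plan.append((best, sorted(gained)))
        uncovered -= gained
    return plan

for dialect, features in plan_collection(DIALECT_FEATURES):
    print(f"collect {dialect}: covers {features}")
```

Under these toy inputs the greedy pass schedules three targeted collections instead of four exhaustive ones, which is the kind of cost trade-off Ghosh describes.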
Legal safeguards versus open‑source public‑good framing
Speakers: Thomas J. Vallianeth, Amitabh Nag
Legal Foundations for Data Ownership and Privacy · Replicable policy and toolkit frameworks as enablers
Thomas stresses that voice datasets, even if publicly available, may be copyrighted and require careful provenance checks, appropriate licences, privacy-enhancing techniques, and robust documentation to ensure downstream trust [208-218]. Nag treats open-source voice technologies as public goods that can be replicated and scaled globally, focusing on inclusion and continuous upgrade without foregrounding legal constraints [3][16].
POLICY CONTEXT (KNOWLEDGE BASE)
Reflects the tension between fostering open-source commons and ensuring legal compliance, as discussed in the context of data free flow with trust [S43] and concerns about procurement and licensing laws for open-source code [S47].
Unexpected Differences
Subjectivity of evaluation versus demand for objective national benchmarks
Speakers: Amitabh Nag, Thomas J. Vallianeth
Audience Perception as Evaluation Criterion · National‑level evaluation framework for Indian language AI
While both aim for trustworthy AI, Nag dismisses the possibility of objective ranking, stating that evaluation is decided by audience perception and cannot produce a “best” model [256-270]. Thomas, however, calls for a standardized national framework with regular benchmarking contests to create comparable, objective metrics [313-314]. This clash between a perception-based view and a formalized, metric-driven approach was not anticipated given their shared focus on reliability.
POLICY CONTEXT (KNOWLEDGE BASE)
Echoes the observed tension between contextual evaluation approaches and the institutional demand for standardized benchmarks noted in AI governance forums [S65] and the EU AI Act’s objective evaluation requirements [S63].
Open‑source scaling versus legal licensing and privacy concerns
Speakers: Amitabh Nag, Thomas J. Vallianeth
Replicable policy and toolkit frameworks as enablers · Legal Foundations for Data Ownership and Privacy
Nag promotes open-source voice technologies as freely replicable public goods for rapid scaling [3][16], whereas Thomas warns that even open datasets may be subject to copyright and privacy law, requiring careful licensing, provenance checks, and documentation to protect downstream users [208-218][311-313]. The tension between an unrestricted open-source vision and a cautious legal compliance stance was not foreseen.
POLICY CONTEXT (KNOWLEDGE BASE)
Corresponds to discussions on long-term licensing costs and sustainability of open-source assets [S44], as well as privacy and legal restrictions on data release in open-data policies [S58] and procurement law constraints on open-source use [S47].
Overall Assessment

The discussion revealed three principal fault lines: (1) how to evaluate voice AI—whether through audience perception, multi‑layered/context‑aware metrics, or standardized national benchmarks; (2) the optimal data‑collection strategy—brute‑force field work plus product feedback versus linguistically‑informed modeling to cut costs; (3) the balance between an open‑source public‑good mindset and the legal safeguards required for copyright and privacy. While participants share common goals of inclusivity, continuous improvement, and multi‑stakeholder collaboration, they diverge on concrete pathways to achieve these goals.

Moderate to high. The disagreements are substantive enough to affect policy design, funding allocations, and implementation road‑maps, requiring coordinated effort to reconcile technical, legal, and evaluation perspectives for a coherent voice AI ecosystem.

Partial Agreements
All participants concur that datasets cannot be static; they should be continuously enriched through feedback loops, post‑deployment monitoring, and community contributions, ensuring models improve over time [124-131][121-144][103][146-148].
Speakers: Amitabh Nag, Prasanta Ghosh, Harleen Kaur, Nihar Desai
Continuous, Feedback‑Driven Data Creation · Flywheel model for ongoing dataset enrichment · Continuous post‑deployment monitoring and standardized documentation · Datasets must be “lived‑in” and built upon by users
The speakers share the goal of treating foundational speech datasets as digital public goods that are openly available to support inclusion and public services, especially for low‑literacy and non‑commercial language communities [118-119][146-148][79-81][34-38].
Speakers: Nihar Desai, Harleen Kaur, Ariane Ahildur
Data as Digital Public Good · Treat Foundational Datasets as Public Goods · Inclusive Voice as Public Good
All agree that a multi‑stakeholder, collaborative approach—bringing together government, industry, academia, and civil society—is essential to accelerate inclusive voice AI development and deployment [57-61][112-113][41-43][26-28].
Speakers: Moderator, Ariane Ahildur, Harleen Kaur
Multi‑stakeholder convening as catalyst for voice AI ecosystem · Indo‑German partnership and cooperation · Joint effort involving distinguished partners and experts
Takeaways
Key takeaways
Inclusive voice AI must be treated as a public good, requiring continuous, diversity‑aware design and regular updates (Nag, Ahildur).
Foundational speech datasets should be created, maintained, and enriched as digital public goods through ongoing collection, user feedback loops, and open‑domain pipelines (Nag, Desai, Kaur).
A four‑pillar policy framework is proposed: treat data as public goods, institutionalise sustainable open‑source infrastructure, build open and representative models, and strengthen responsible deployment (Kaur).
Evaluation of speech systems needs multi‑layered, context‑sensitive metrics that go beyond simple word‑error‑rate, incorporating audience perception, downstream task tolerance, and human judgment (Ghosh, Nag, K.R.).
Legal and governance aspects must address copyright, privacy, licensing, and documentation from the outset to ensure trustworthy, compliant ecosystems (Vallianeth).
Open‑source stewardship, standardisation, and national‑level benchmarking are essential for scaling inclusive voice technologies across India’s linguistic diversity (Ghosh, Desai).
Industry adoption hinges on scalable, edge‑ready infrastructure, domain‑specific model fine‑tuning, and safeguards that align with compliance and security requirements (K.R., Nag).
Resolutions and action items
Commit to treating foundational speech datasets as Digital Public Goods and to funding and convening efforts for under‑served languages (policy recommendation).
Develop and publish a developer toolkit that embeds Responsible AI practices, diversity planning, data‑quality checks, and post‑deployment monitoring (Kaur).
Establish a continuous data‑flywheel: collect primary corpora, generate improvement corpora from deployed products, and feed back into model training (Nag).
Initiate regular workshops and stakeholder meetings to co‑design a national, multi‑layered evaluation framework and annual benchmarking leaderboard for Indian languages (Ghosh, Desai).
Implement documentation standards, privacy‑enhancing techniques, and clear licensing strategies for all datasets and models from the start (Vallianeth).
Encourage governments to act as ecosystem stewards and standard‑setters, not only regulators, by supporting open‑source infrastructure and public‑good funding mechanisms (Kaur, Desai).
Unresolved issues
Exact methodology for a unified, India‑wide evaluation benchmark that accommodates linguistic variability and subjective audience perception.
How to balance cost‑effective data collection with the need for comprehensive dialect coverage without a clear, agreed‑upon trade‑off model.
Legal evidentiary standards for disputes over AI outputs and how courts will assess compliance with privacy and copyright requirements.
Specific mechanisms for ensuring end‑use safeguards and licensing choices for open‑source models in sensitive applications (e.g., hate‑speech detection).
Operational details for scaling edge‑deployment infrastructure across diverse industry sectors.
Suggested compromises
Adopt a hybrid data‑collection strategy: combine brute‑force primary corpus gathering with targeted intrinsic‑component sampling to reduce cost while preserving diversity (Ghosh, Nag).
Use both objective metrics (e.g., error rates) and subjective audience‑perception assessments to evaluate models, acknowledging that perfect ranking may be unattainable (Nag, Ghosh).
Allow open‑source datasets for general use but require additional licensing or safeguards for high‑risk applications, tailoring the approach to end‑use scenarios (Vallianeth).
Blend government stewardship with community‑driven open‑source governance, sharing responsibility for standards, funding, and sustainability (Kaur, Desai).
Implement multi‑layered evaluation pipelines that feed back into model improvement, thereby aligning academic rigor with industry practicality (K.R., Ghosh).
Thought Provoking Comments
AI systems have a very short shelf life – sometimes only three to six months – because of the immense diversity of people, languages, and cultures. Unlike static machines, AI must be continuously upgraded and inclusion has to be built into the design.
Highlights the fundamentally dynamic nature of AI compared to traditional technology and stresses that diversity and inclusion are not add‑ons but core design constraints, reframing how stakeholders should think about sustainability.
Set the stage for the entire discussion, prompting participants to consider continuous data collection, feedback loops, and policy mechanisms rather than one‑off solutions. It led directly to Nihar’s question about treating datasets as Digital Public Goods and to Amitabh’s later elaboration on primary vs. improvement corpora.
Speaker: Amitabh Nag
Voice AI is a gateway to public services for millions with limited literacy; when it works in local languages it enables inclusion, but when it doesn’t it can reinforce exclusion. This is a narrative of cooperation, not competition, between India and Germany.
Frames voice technology as a social equity issue and positions the Indo‑German partnership as a model of collaborative, responsible AI, shifting the conversation from technical details to broader societal impact.
Reoriented the audience toward the ethical stakes of the work, reinforcing Harleen’s policy pillars and encouraging the panel to discuss inclusion not just as a technical challenge but as a shared value.
Speaker: Ariane Ahildur
Our policy framework rests on four pillars: treating foundational datasets as public goods, institutionalising sustainable open‑source infrastructure, building open and representative models, and strengthening responsible deployment.
Provides a concrete, actionable structure that bridges high‑level policy with developer‑level practice, making the abstract goals of inclusion and responsibility tangible.
Guided the subsequent panel questions, especially Nihar’s probing about data‑as‑public‑good and Amitabh’s discussion of continuous data creation. It also gave a reference point for the legal and evaluation debates that followed.
Speaker: Harleen Kaur
Instead of brute‑force data collection across every dialect, we can start from intrinsic linguistic families (Indo‑Aryan vs Dravidian) and then strategically collect data that covers common acoustic bases, using smart trade‑offs to reduce cost and time.
Introduces a novel, linguistically informed methodology that could dramatically improve efficiency of dataset creation while preserving coverage, challenging the prevailing assumption that more data is always better.
Shifted the conversation from quantity to strategic quality, prompting Amitabh to elaborate on primary vs. improvement corpora and inspiring later discussion on evaluation metrics that respect linguistic variation.
Speaker: Prasanta Ghosh
Data sets sit at the intersection of privacy and copyright law. Even publicly available data may be copyrighted, so we must verify provenance, use privacy‑enhancing technologies, and maintain rigorous documentation from the start.
Brings a critical legal dimension that many technical participants overlook, emphasizing that compliance is not an afterthought but a design requirement.
Prompted the panel to consider legal safeguards alongside technical solutions, influencing Amitabh’s later remarks about trust and leading Thomas to later discuss how documentation can reduce subjectivity in legal disputes.
Speaker: Thomas J. Vallianeth
Human transcribers rarely agree word‑for‑word; therefore, using word error rate alone is insufficient. We need multi‑layered evaluation, possibly returning multiple hypotheses, and linking ASR performance to downstream task outcomes.
Challenges the dominant evaluation paradigm, exposing its inadequacy for Indian linguistic diversity and proposing a more nuanced, application‑centric assessment approach.
Catalysed a deeper dive into evaluation methods, leading Amitabh to argue that audience acceptance matters more than absolute scores, and setting up the later call for a national leaderboard.
Speaker: Prasanta Ghosh
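The disagreement is easy to demonstrate: score two careful human transcripts of the same clip against each other, and the word error rate is already well above zero, so a model evaluated against a single reference inherits that noise floor. A minimal sketch, assuming the third-party jiwer package; the romanised transcripts are invented examples.

```python
# Two plausible human transcripts of the same clip: honest annotators
# can differ on spellings, fillers and word splits.
from jiwer import wer  # pip install jiwer

annotator_a = "haan ji main kal office aa raha hoon"
annotator_b = "han ji mein kal office aaraha hun"

# WER of one human against the other: the noise floor that any
# single-reference evaluation of a model inherits.
print(f"inter-annotator WER: {wer(annotator_a, annotator_b):.2f}")

# Scoring against multiple references and keeping the best (lowest)
# error is one simple counterpart to returning multiple hypotheses.
hypothesis = "haan ji mein kal office aa raha hoon"
print(min(wer(ref, hypothesis) for ref in (annotator_a, annotator_b)))
```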
Evaluation should be judged by whether the audience understands and accepts the output, not by ranking models as first, second, or third. Different contexts (court, meeting) demand different levels of purity and tolerance.
Reframes evaluation from an objective ranking to a user‑centric acceptance model, highlighting the contextual nature of ‘accuracy’ in real‑world deployments.
Steered the discussion toward practical deployment concerns, resonating with Kritika’s focus on scalable infrastructure and prompting Thomas to discuss trust‑by‑design as a way to manage subjective judgments.
Speaker: Amitabh Nag
Legal disputes will increasingly involve subjective judgments about AI outputs. Building trust through upfront safeguards, thorough documentation, and privacy‑enhancing measures can reduce the need for courts to arbitrate nuanced cases.
Offers a forward‑looking solution that links technical governance with legal risk mitigation, acknowledging the evolving nature of AI jurisprudence.
Provided a concluding bridge between the technical, policy, and legal strands of the conversation, reinforcing the earlier call for holistic documentation and influencing the final consensus on needing more workshops and a unified evaluation framework.
Speaker: Thomas J. Vallianeth
Overall Assessment

The discussion was shaped by a series of pivotal insights that moved it from a generic launch event to a deep, interdisciplinary exploration of voice AI in India. Amitabh’s opening remark about AI’s fleeting shelf‑life framed the need for continuous, inclusive data pipelines, which Harleen then codified into a four‑pillar policy. Prasanta’s linguistic‑family approach and critique of word‑error‑rate evaluation introduced strategic efficiency and methodological rigor, prompting the panel to rethink data collection and performance metrics. Ariane’s emphasis on inclusion and cooperation set a moral compass, while Thomas’s legal analysis anchored the conversation in compliance and trust‑by‑design. Together, these comments redirected the dialogue toward user‑centric evaluation, sustainable open‑source ecosystems, and proactive legal safeguards, culminating in a consensus that future progress will require coordinated workshops, national evaluation standards, and a holistic, trust‑engineered approach.

Follow-up Questions
What mechanisms are needed to continuously create and facilitate digital public good voice datasets while ensuring trust and safety, and can a data‑flywheel model be established?
Understanding sustainable data pipelines is crucial for keeping voice AI models up‑to‑date and trustworthy, especially given rapid changes in language use.
Speaker: Nihar Desai
What are the current gaps at the research and academia level in designing inclusive datasets that lead to better downstream applications?
Identifying academic shortcomings will guide targeted research to improve dataset representativeness and model performance across diverse Indian languages.
Speaker: Nihar Desai
Can you provide concrete examples of how initiatives (e.g., ResPin) balanced inclusivity with model‑building constraints and other factors?
Real‑world case studies illustrate practical trade‑offs and inform best‑practice guidelines for future projects.
Speaker: Nihar Desai
What challenges have industry practitioners faced regarding inclusivity at the dataset layer or application layer, and how have they been addressed?
Capturing industry pain points helps align research, policy, and tooling with real deployment needs.
Speaker: Nihar Desai
How can innovation in speech models and datasets be balanced with legal caution concerning copyright, privacy, and data governance?
Balancing rapid development with compliance is essential to avoid legal risks while fostering open innovation.
Speaker: Nihar Desai
What day‑to‑day challenges arise in evaluating ASR systems for Indian languages, and how might these challenges be resolved or mitigated in the future?
Improved evaluation methods are needed to reflect linguistic variability and ensure reliable performance metrics.
Speaker: Nihar Desai
From a legal standpoint, how should subjective evaluation outcomes be handled in procurement decisions, dispute resolution, and evidentiary standards for AI outputs?
Clarifying legal treatment of subjective AI assessments will support fair contracting and judicial review.
Speaker: Nihar Desai
What open points, arguments, or calls to action should the ecosystem prioritize to advance speech models and datasets?
Gathering community‑wide input can shape future research agendas, standards, and collaborative initiatives.
Speaker: Nihar Desai
What specific safeguards and end‑use considerations are required when deploying open‑source speech datasets and models?
Tailored safeguards ensure that open resources are used responsibly across varied applications such as hate‑speech detection versus translation.
Speaker: Nihar Desai
How can India develop a national‑level evaluation framework and annual benchmarking process for voice technologies across its many languages and dialects?
A coordinated national benchmark would drive continuous improvement and comparability across stakeholders.
Speaker: Nishant (participant)
Should a unified leaderboard (e.g., under Bhashini) be created to evaluate models across languages and dialects, and how should it be structured?
A single, comprehensive leaderboard would foster healthy competition and collaborative progress in multilingual ASR.
Speaker: Prasanta Ghosh
What multi‑layered evaluation metrics (beyond word error rate) are needed to capture both objective and subjective performance of ASR systems in Indian contexts?
Current metrics miss linguistic variability; richer evaluation would better reflect real‑world usefulness.
Speaker: Prasanta Ghosh
What documentation practices, privacy‑enhancing techniques, and governance structures are required to ensure legal compliance throughout the voice data lifecycle?
Robust documentation and privacy safeguards are foundational for trustworthy, law‑compliant AI ecosystems.
Speaker: Thomas J. Vallianeth
How can sustainable open‑source infrastructure and governance models be established to support the long‑term viability of the voice technology ecosystem?
Ensuring funding, maintenance, and community stewardship is vital for open resources to remain usable over time.
Speaker: Harleen Kaur
What policies are needed to treat foundational speech datasets as public goods, including mechanisms for funding, convening, and supporting non‑commercial languages?
Public‑good treatment can unlock resources for under‑served languages and promote equitable AI access.
Speaker: Harleen Kaur
How can Responsible AI (RAI) practices be embedded throughout the development lifecycle, including community consent, privacy, and post‑deployment monitoring?
Integrating RAI at every stage reduces bias, misuse, and builds public trust in voice AI applications.
Speaker: Harleen Kaur

Disclaimer: This is not an official session record. DiploAI generates these resources from audiovisual recordings, and they are presented as-is, including potential errors. Due to logistical challenges, such as discrepancies in audio/video or transcripts, names may be misspelled. We strive for accuracy to the best of our ability.