Open Forum #73 Indigenous Peoples Languages in a Digital Age
26 Jun 2025 16:00h - 17:00h
Session at a glance
Summary
This panel discussion at the Internet Governance Forum focused on barriers to indigenous language technology and AI uptake, held during the International Decade of Indigenous Languages. The session brought together experts from various backgrounds, including representatives from the Sámi Parliament, UNESCO, Meta, and academic institutions, to address challenges facing indigenous languages in digital spaces.
The discussion revealed that while language technology for indigenous languages exists and is technically feasible, the main barriers are not technological but structural and systemic. Platform owners often make it difficult or impossible for indigenous communities to implement their language tools in mainstream applications and systems, even when the technology is available. This creates a significant gap between what is technically possible and what communities can actually deliver to their users.
Several key challenges were identified, including limited digital infrastructure, lack of written language systems for oral traditions, restrictive data protection regulations that complicate data collection from indigenous communities, and the dominance of global languages in online content. The panelists emphasized that large tech companies often don’t see indigenous languages as profitable markets, leading to their exclusion from digital platforms.
The discussion highlighted the importance of community involvement and data sovereignty, with speakers stressing that indigenous peoples must be partners and co-creators rather than passive users in technology development. The principle of free, prior, and informed consent was emphasized as essential for ethical data collection and use.
Meta’s representative presented several initiatives, including open-source AI models and translation tools for 200 languages, demonstrating how open-source approaches can enable communities to adapt and control their own language technologies. The session concluded with UNESCO’s call for a mindset shift toward viewing language rights as human rights that must be respected in digital spaces, emphasizing the need for inclusive, community-driven innovation rather than top-down technological solutions.
Keypoints
## Major Discussion Points:
– **Barriers to Indigenous Language Technology Access**: Despite existing language technology capabilities, indigenous communities face significant obstacles in implementing and distributing their language tools due to closed platforms, restrictive policies from major tech companies, and lack of accessible integration pathways for minority languages.
– **Data Sovereignty and Community Control**: The tension between needing large datasets to train AI models for indigenous languages while ensuring communities maintain ownership and control over their linguistic data, with emphasis on free, prior, and informed consent principles rather than simple permission-seeking.
– **AI’s Potential vs. Risks for Indigenous Communities**: AI presents opportunities to bridge equity gaps in education, healthcare, and adaptive learning for indigenous peoples, but also risks widening disparities if these communities are excluded from AI development and implementation processes.
– **Open Source vs. Proprietary Solutions**: The advantages of open-source AI models and technologies that allow communities to customize, refine, and maintain control over their language tools, contrasted with the limitations of closed, proprietary platforms that restrict community agency.
– **Need for Systemic Change Beyond Technology**: Recognition that the challenges aren’t primarily technical but structural, political, and ethical, requiring a fundamental mindset shift from treating indigenous languages as niche markets to recognizing language rights as human rights integral to digital inclusion.
## Overall Purpose:
The discussion aimed to examine barriers preventing indigenous and minority language communities from accessing and benefiting from language technology and AI, while exploring solutions for more equitable digital inclusion during the UN International Decade of Indigenous Languages.
## Overall Tone:
The discussion maintained a collaborative and solution-oriented tone throughout, with speakers demonstrating mutual respect and shared commitment to indigenous language rights. While participants acknowledged serious challenges and historical injustices, the tone remained constructive and forward-looking, emphasizing partnership, community empowerment, and the urgent need for systemic change in how technology platforms approach linguistic diversity.
Speakers
– **MODERATOR**: Session moderator (role unclear from transcript)
– **Sjur Norstebo Moshagen**: Head of Sámi language technology work at the University of Tromsø, panel debate moderator
– **Ole Henrik Bjorkmo Lifjell**: Member of the Governing Council of the Sámi Parliament
– **David Castillo Barra**: International consultant specializing in promotion of multilingualism, member of the Secretariat for the International Decade of Indigenous Languages at UNESCO
– **Lars Ailo Bongo**: Professor in health technology at the Department of Computer Science at the University of Tromsø, adjunct professor at the Sámi University College heading the Sámi AI Lab
– **Outi Kaarina Laiti**: Computer game researcher, designer, and media education specialist from the National Audiovisual Institute of Finland, blends Sámi culture with tech and education
– **Valts Ernstreits**: Livonian language activist developing digital tools for endangered languages, works at the University of Latvia Livonian Institute, focuses on global digital inclusion policies
– **Aili Keskitalo**: Former Sámi Parliament president, indigenous rights advocate focusing on climate and just transition in Sápmi, works for Amnesty International in Norway
– **Kevin Chan**: Works at Meta on global digital policy to empower indigenous languages online
– **Tawfik Jelassi**: UNESCO official (specific title not mentioned in transcript)
– **Audience**: Audience member, Henry Wang from Singapore IGF, founding member of Singapore Internet Governance Forum, co-founder of LingoAI
**Additional speakers:**
None identified beyond the speaker list above.
Full session report
# Indigenous Languages in the Digital Age: Barriers to Technology and AI Uptake
## Executive Summary
This panel discussion at the Internet Governance Forum examined the critical barriers preventing indigenous and minority language communities from accessing and benefiting from language technology and artificial intelligence during the UN International Decade of Indigenous Languages. The session brought together representatives from the Sámi Parliament, UNESCO, Meta, and leading academic institutions to address the challenges facing indigenous languages in digital spaces.
The discussion revealed that while language technology for indigenous languages is technically feasible, the primary barriers are structural, political, and ethical rather than technological. Platform owners often create obstacles for indigenous communities attempting to implement their language tools in mainstream applications, creating a gap between what is technically possible and what communities can actually deliver to their users.
## Opening Context and Moderation
**Sjur Norstebo Moshagen** from the University of Tromsø served as moderator, with **David Castillo Barra** from UNESCO’s Secretariat for the International Decade of Indigenous Languages as online co-moderator. The session opened with **Ole Henrik Bjorkmo Lifjell** from the Sámi Parliament setting the context for indigenous language challenges in the digital age.
## Key Presentations
### Sámi Parliament Perspective
Ole Henrik Bjorkmo Lifjell emphasized the political and rights-based dimensions of language technology access, highlighting how indigenous communities face systematic exclusion from digital platforms and services.
### Academic Research Insights
**Lars Ailo Bongo**, Professor at the University of Tromsø and head of the Sámi AI Lab, discussed AI’s potential to bridge equity gaps in healthcare and education for indigenous communities. He noted that “AI can bridge maybe the most important equity gap that indigenous people are exposed to, which is the lack of experts in fields like medicine or education that has the language and cultural knowledge needed to understand and provide equitable services.”
However, Bongo also highlighted regulatory challenges, explaining that “Indigenous people face a dilemma as data subjects requiring extra protection under GDPR, yet needing data collection to ensure AI works equitably for minorities.” He proposed regulatory sandboxes as a potential solution for ethical data collection.
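The equity requirement described here, that a service should work as well for a minority group as for everyone else, can be made concrete as a per-group accuracy comparison. The following is only an illustrative sketch: the group labels, the sample records, and the 0.05 tolerance are hypothetical assumptions and do not come from the session or from any regulation.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return {group: accuracy} for an iterable of (group, predicted, actual)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, predicted, actual in records:
        total[group] += 1
        if predicted == actual:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


def max_accuracy_gap(records):
    """Difference between the best-served and worst-served group."""
    scores = accuracy_by_group(records)
    return max(scores.values()) - min(scores.values())


# Hypothetical evaluation records: (group label, model output, ground truth).
sample = [
    ("majority", 1, 1), ("majority", 0, 0), ("majority", 1, 1), ("majority", 0, 1),
    ("sami", 1, 1), ("sami", 0, 1), ("sami", 1, 0), ("sami", 0, 0),
]

TOLERANCE = 0.05  # assumed threshold, chosen only for illustration

scores = accuracy_by_group(sample)
gap = max_accuracy_gap(sample)
needs_review = gap > TOLERANCE
print(scores, gap, needs_review)
```

Note that the comparison cannot be computed without the group label on each record, which is exactly the special-category data whose collection creates the dilemma Bongo describes.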
### Educational Technology Integration
**Outi Kaarina Laiti** from Finland’s National Audiovisual Institute described Finland’s 10-year experience with programming education, including the development of Sámi programming guides and media archives for speech recognition training. She mentioned ongoing projects including the Sámi Game Jam and extended reality initiatives since 2018, while noting that questions about “how to teach programming in Sámi languages and what are the cultural aspects of computing” remain unresolved.
### Endangered Language Perspectives
**Valts Ernstreits** from the University of Latvia Livonian Institute provided insights from his work with Latvia’s indigenous Livonian population and with global digital inclusion policies. He emphasized that technology currently caters primarily to the top 200 languages globally, leaving the vast majority of languages in secondary positions.
### Rights-Based Framework
**Aili Keskitalo**, former Sámi Parliament president and current Amnesty International advocate, provided a powerful rights-based perspective. She noted that “over 98% of the world’s languages lack basic digital tools, creating a threat of digital extinction rather than just a gap.” Keskitalo warned that “AI is not neutral and can replicate colonial logics if indigenous peoples are not involved from the beginning as rights holders, not just users.”
She called for “the shift from seeking permission to entering true partnerships with indigenous peoples as co-creators, applying free, prior and informed consent principles.”
### Industry Perspective
**Kevin Chan** from Meta outlined several company initiatives, including:
– Facebook translation capabilities for Inuktitut (developed over 5 years)
– The “No Language Left Behind” translator supporting 200 languages
– The Language Technology Partnership seeking community collaboration
Chan explained that the partnership seeks collaborators who can provide speech recordings with transcriptions (requiring about 10 hours of recordings) or written text samples to build new open-source speech technologies. He argued that open-source AI technologies “can be valuable for indigenous communities as they allow refinement, fine-tuning, and community ownership of adapted models.”
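A community preparing such a contribution might first check how many hours of its recordings actually have transcriptions. The sketch below assumes a hypothetical CSV manifest with the columns `audio_file`, `duration_seconds`, and `transcript`; this layout, the file names, and the helper `summarize_manifest` are illustrative assumptions and not a format specified by Meta or by the partnership. Only the roughly 10-hour figure comes from the session.

```python
import csv
import io

TARGET_HOURS = 10.0  # approximate figure mentioned in the session


def summarize_manifest(csv_text):
    """Sum the duration of transcribed recordings and list untranscribed files."""
    reader = csv.DictReader(io.StringIO(csv_text))
    transcribed_seconds = 0.0
    missing = []
    for row in reader:
        if row["transcript"].strip():
            transcribed_seconds += float(row["duration_seconds"])
        else:
            missing.append(row["audio_file"])
    hours = transcribed_seconds / 3600
    return {
        "transcribed_hours": hours,
        "missing_transcripts": missing,
        "meets_target": hours >= TARGET_HOURS,
    }


# Hypothetical manifest: the second recording has no transcription yet.
sample_manifest = """audio_file,duration_seconds,transcript
story_001.wav,1800,Example transcription text
story_002.wav,5400,
story_003.wav,3600,Another example transcription
"""

summary = summarize_manifest(sample_manifest)
print(summary)
```

Counting only transcribed audio reflects the requirement that recordings come with transcriptions; untranscribed files are listed separately so that a community can see where text is still missing.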
## Platform Restrictions and Technical Barriers
Moshagen articulated a key insight: “platform owners make life hard for most of the world’s languages, but probably mostly without realizing it. I don’t think there’s bad intent behind it. It’s just ignorance or negligence.” This observation highlighted how technological barriers are often artificial constructs created by platform policies rather than genuine technical limitations.
The panelists agreed that language technology for indigenous languages often works effectively in controlled environments but cannot be delivered through the applications and systems where users actually want to employ them due to platform restrictions and closed system architectures.
## Audience Engagement
**Henry Wang** from the audience raised an important question about alternative approaches, specifically mentioning the SOLID protocol and LingoAI as potential solutions for data ownership issues. This intervention highlighted ongoing technical discussions about decentralized approaches to language technology.
## UNESCO’s Global Vision
The discussion concluded with **Tawfik Jelassi** from UNESCO presenting the organization’s comprehensive vision for digital language equality. He emphasized that “indigenous communities must be central to technology design, development and governance, with their knowledge systems essential for ethical digital futures.”
Jelassi mentioned specific initiatives including the Mayan Language Preservation and Digitalization Project with Masterwords, and UNESCO’s Global Roadmap for Multilingualism, which aims to ensure that all language communities can thrive in the digital age with technology that is multilingual by design.
He concluded with a quote from Nelson Mandela: “If you talk to a man in a language he understands, that goes to his head. If you talk to him in his language, that goes to his heart,” emphasizing the importance of building “a digital future that requires linguistic justice, cultural dignity and inclusive technology that speaks to hearts through indigenous languages.”
## Key Themes and Challenges
### Data Sovereignty and Community Control
A fundamental theme was the need for indigenous communities to maintain control over their linguistic data and be involved as co-creators throughout technology development processes, not merely as data sources or end users.
### Regulatory Complexities
The discussion revealed tensions between regulations such as GDPR and the EU AI Act on the one hand and the practical needs of indigenous language AI development on the other, where enhanced protection requirements can create barriers to the data collection needed for effective AI systems.
### Rights-Based Approach
Multiple speakers emphasized the need to recognize language rights as human rights in digital spaces, shifting focus from treating indigenous languages as optional features toward recognizing them as fundamental rights that platforms should support.
### Open Source Solutions
Several panelists expressed support for open-source approaches as a way to provide communities greater control over their language technologies while leveraging existing technological infrastructure.
## Conclusion
The panel demonstrated broad agreement on fundamental principles while revealing the complex technical, legal, and ethical challenges that must be addressed to ensure indigenous languages can thrive in the digital age. The discussion highlighted both the potential of AI and language technology to support indigenous communities and the risks of further marginalization if these communities are not involved as rights holders and co-creators in technology development.
The session concluded with calls for sustained commitment to fundamental changes in how the technology industry approaches linguistic diversity, moving beyond market-driven approaches toward rights-based inclusion that recognizes indigenous languages as essential components of human cultural heritage.
Session transcript
Sjur Norstebo Moshagen: Hello, and welcome, everybody, both here and online. We’ll start by getting some very nice words from Ole Henrik Bjorkmo Lifjell, who is a member of the Governing Council of the Sámi Parliament. Please.
Ole Henrik Bjorkmo Lifjell: Buribije Bores. Dear participants, I have the honour of opening and welcoming the people present here and the online participants to this panel discussion, which will highlight the importance of the subject of indigenous languages, technology, and AI. To start with, let me express thanks to UNESCO and IDIL for putting these subjects on the agenda and promoting the visibility of the International Decade of Indigenous Languages. Indigenous and minority communities face barriers, as there is limited digital infrastructure and few digital tools supporting the use of our languages. Large tech companies may not see indigenous languages as profitable markets, and most online content is dominated by a handful of global languages. Many indigenous communities have oral traditions as cultural preservation, and the lack of a written language makes digitization complex and sometimes inappropriate without community consent. To overcome these barriers, we need to follow up with the following actions. We need to reduce language loss and revitalize indigenous languages, and technological development for indigenous languages needs to ensure the digital inclusion of indigenous languages, also in digital platforms and AI. We also need to promote policy that uses human rights principles and takes accountability through national laws that regulate and secure indigenous communities’ control and management of linguistic data collection that will benefit our own communities. AI-generated data innovations need to be used in a non-discriminatory way and respect indigenous cultures as well. We need to initiate further collaboration and foster dialogue with big tech companies and developers to include digital tools and language technologies for indigenous communities and speakers of indigenous languages.
We need to remind each other that no language is too small to matter, and by elevating the challenges faced by indigenous and minority language communities, the communities are helping to pave a path toward a more equitable digital future for everyone. This panel debate is now opened, and I encourage participants to establish connections for further exchange on the subjects raised at this Internet Governance Forum in the framework of the Decade. Thank you very much.
Sjur Norstebo Moshagen: Thank you very much. My name is Sjur Norstebo Moshagen, and I’m going to head this panel debate. In my daily life, I’m heading the Sámi language technology work at the University of Tromsø, but today we’re going to discuss barriers to indigenous language technology and AI uptake. And to help me with this, I have online David Castillo Barra as a co-moderator for the online participants. He’s an international consultant specializing in the promotion of multilingualism and currently serves as a member of the Secretariat for the International Decade of Indigenous Languages at UNESCO. He supports initiatives related to UNESCO’s recommendation on multilingualism in cyberspace with a strong focus on fostering linguistic diversity in digital space. And David, maybe you would like to say a few words to present yourself.
David Castillo Barra: Thank you. Good afternoon. Am I audible? I don’t know if you can hear me. Thank you very much. Hello. Good afternoon and greetings from Paris. Thank you, Sjur, for the presentation. I am here from the Secretariat of the International Decade of Indigenous Languages at UNESCO. I’ll be your online moderator today, so feel free to share your questions in the chat for those joining remotely, and I’ll pass them to our panelists during the question and answer session. So thank you very much. I would like to pass the floor again to Sjur to introduce our panelists. Thank you.
Sjur Norstebo Moshagen: Thank you very much, David. Yes, the panelists of today. On site, we have Lars Ailo Bongo, who is a professor in health technology at the Department of Computer Science at the University of Tromsø, as well as an adjunct professor at the Sámi University College heading the Sámi AI Lab there. Online, we have Outi Laiti from the National Audiovisual Institute of Finland. Outi is a computer game researcher, designer, and media education specialist blending Sámi culture with tech and education. Then, again, on site, we have Valts Ernstreits, who is a Livonian language activist developing digital tools for endangered languages, and he is working hard to shape global digital inclusion policies. Then, the last one on site is Aili Keskitalo, who is a former Sámi Parliament president and an indigenous rights advocate now focusing on climate and just transition in Sápmi, and working for Amnesty International in Norway. And finally, online, we have Kevin Chan, who is working at Meta on global digital policy to empower indigenous languages online. So those are our panelists for today. And before they are given the floor, I will say a few words on the topic of today. Sorry about that. So, a starting point for this could be the global roadmap for multilingualism in the digital era that UNESCO is working on right now. They have a draft, and I’ll quote a few sentences from the introduction. And remember, this is only a draft, but I think it’s quite well formulated and goes to the heart of the topic of today. The global roadmap for multilingualism in the digital era provides a strategic framework for advancing language technologies, promoting linguistic diversity and multilingualism, and ensuring that all language users from all language communities thrive in the digital age. Recognizing that language rights are integral to human rights, the roadmap aims to empower every individual to use and preserve the language in digital spaces.
And the question today is, how is that, what’s the actual status, what are the problems, what are the obstacles to actually doing what they are trying to do in that part? So, I will say a very few words on this. I’ve been working on language technology for the Sámi languages for the last 20 years, and what we have seen is that the conditions for third-party language technology are very different from those for first-party language technology. So, tools by Apple and Microsoft are treated very differently from tools by everyone else. So, there are serious problems for these tools, and often they are completely blocked. So, independent localization, for example, is also not possible, or it might not be accessible, or if it’s possible or accessible, it’s not distributable. There are no platforms for providing translations to a piece of software without asking and getting permission from the original developer. And AI for indigenous languages and minority languages, that’s a quite open question. It’s open for probably many languages at the moment, but what we have seen so far is that bad output is dominating for these languages, partly due to lack of data, but also partly because of lack of community involvement and lack of quality assurance and testing and evaluation. And a major question in this discussion is how can one add one’s own language to models from big technology? How can Sámi or any indigenous language be added to the models from OpenAI, from Apple, from Microsoft, whoever? So what we can say is that, as I said, we have been doing this for 20 years. We know the technology, we know we can make it work on the technological level. What we cannot always do is deliver the tools in the apps and the systems and the context where users want to use them. That’s the major problem. Here are some examples that we have experienced. I’m not going to spend much more time on that, but just one short one is spellers and proofing tools in online office applications.
We have no possibility to install these tools so that they behave as people expect them to. So the conclusion is that language technology for indigenous languages is often not possible, even though the technology is there and we know the technology. That’s not the issue. Platform owners make life hard for most of the world’s languages, but probably mostly without realizing it. I don’t think there’s bad intent behind it. It’s just ignorance or negligence. So we need a new approach to how human languages are included and how they are approached in the digital world. And that’s what we are going to discuss today. So then next up is Lars Ailo Bongo.
Lars Ailo Bongo: Thank you. And thank you for inviting me to give this talk. So I’m going to talk about future issues that may sort of hinder the use of indigenous language and indigenous AI. So AI, it has a great potential to bridge maybe the most important equity gap that sort of indigenous people are exposed to, which is the lack of experts in fields like medicine or education who have the language and cultural knowledge needed to sort of understand and provide equitable services. So for instance, in psychology, there are very few tests that are normed on indigenous minority languages. So the tests basically don’t work well for indigenous people. So AI has the potential to provide something where there was nothing before. And also in education, AI has a great potential to provide adaptive learning, which is very important for minority language speakers because the level of language knowledge often has a greater variation than in the majority languages. So there’s a great potential for AI to sort of bridge this equity gap. But then again, if indigenous peoples are excluded from using AI, then we are at risk that this equity gap will just widen when the majority people start using AI for their health service and their educational service. It’s very important that indigenous people are included in these new AI services. And luckily, this is regulated somewhat by law. So this is from the EU AI Act, which basically says that it’s not allowed to discriminate against minorities such as indigenous people. So if there is an educational service or a health service provided, it should work as well for minority people as for the other people. However, there is one big challenge, which is that indigenous people and other minorities are considered data subjects in a special category. So this requires extra strong data protection.
And this is, for instance, regulated by the GDPR law that says that it’s not allowed even to collect this data unless you have a really good purpose. But also, the EU AI Act says that it is allowed to actually do this, to collect ethical data if the purpose is to prove that this AI works as well for indigenous people as for other minorities. And I just want to illustrate the dilemma that the indigenous people and also the AI providers are facing. So let’s say that we want to do adaptive learning, which is maybe the application that is highest on the priority list of many indigenous people. So in order to do that, AI can help. But to do that, you need to build this AI and adaptive learning. One important component of that is cognitive tests. And this includes IQ tests. But if you want to do an IQ test that is equitable and works well for minority languages and cultures, you need to build that with the minority language and culture in mind. And that means you need to collect data from these minorities. And let’s say that these are indigenous children, then you need to collect basically indigenous IQ tests from indigenous people. And this is, of course, very controversial, because being an indigenous person myself, I know that we have historically been exposed to basically racist research where they sort of attempted to show that indigenous people are less intelligent. But I guess that if we want to exploit all the opportunities that AI gives, including in the educational field, we must basically now start collecting this type of data. But luckily, we can do this in a much more ethical way than was done in the Dark Ages. So we can build some regulatory sandboxes that ensure that this data collection is done in an ethical and safe manner. And I think this is really important.
We need to really start working on this in order to not leave the indigenous and other minority people and languages behind when the new AI tools are going to be used in important services like health and education. Thank you.
Sjur Norstebo Moshagen: Thank you very much, Lars. Next one out is Outi Laiti. So. Yeah. Please go ahead.
Outi Kaarina Laiti: Thank you. My slide is somewhere. Is it this one? Probably we need to share this Zoom. It should be shared. I can see only myself. Okay, on the screen shared in the stream here, it’s both you and the slide. Yes, I can see it on mine too. Okay, now I can see it, thank you. I’m going to dive in and thank you for having me as an indigenous woman. I come from the margins and it’s always a pleasure to be talking about computing. Programming has not been my passion. Games are, but when I was like three or five or something like that, I wrote my first line of code because I wanted to play games, like Commodore 64 was a huge hit in the 1980s. So when Finland introduced programming 10 years ago as a part of our basic education, it was the starting point of doing programming research because we have Sámi people living in North Finland and no one knew how to actually do this. The questions like how to teach programming in Sámi languages, what are the cultural aspects of computing, they still exist after 10 years of educating children in basic education. And this change was huge, like all teachers at all levels should teach programming, from crafts to gym teachers, they should all do it and starting from grade one. And I guess it has been 10 years, so we have like one generation of Sámi basic education programmers ready, or maybe they’re not. But anyway, can I get the next slide? And then the games. Nearly all the games I know that go under the digital Sámi game umbrella, they are all for education, and language education especially. We call these serious games. And then we have a lot of developing content in games. This can go under the same umbrella of Sámi games when we are developing content in platforms like Second Life, Minecraft and so on, which I call the indigenous metaverse. It’s growing rapidly and most of these platforms are privately owned. Then we have semi-private platforms like, for example, in universities.
York University has its own indigenous metaverse in development, and Helsinki University did Serendip, which is not indigenous, but it has some indigenous content. I have done extended reality projects since 2018 in Sámi Game Jam. The major issue is that you cannot actually use extended reality in language education, because we don’t have the tools to have discussions in virtual reality, if we don’t use like voice over IP or something like that. And it’s easier to use non-human centered design in games for multiple reasons, but ethical questions are one, for example, representation. If I’m doing non-playable or playable Sámi characters, what should I represent? And what are they saying, if they are talking, and how are they talking? These are all ethical questions. And next slide, please. There has been some progress because Finland introduced this programming in basic education. For example, the National Audiovisual Institute has published these guides for media education and programming in the three Sámi languages spoken in Finland. The picture is actually from a Skolt Sámi programming guide, which is the coolest thing I have seen for a while. We have huge Sámi media archives that have been used to train like automatic speech recognition tools. But the problem is that we are missing the text equivalent that we could actually use. We should have archives combining both, like the textual version of speech and the actual speech. So this development is quite slow. Thank you.
Sjur Norstebo Moshagen: Thank you very much. The next speaker is Valts.
Valts Ernstreits: Thank you for inviting me to this panel, which is really crucial for the IGF as well. Just a couple of words about my background. I originally represent Latvia’s indigenous Livonian population. Next year we will celebrate 35 years since official recognition in Latvia. I have been active in promoting Livonian issues for the past 30 years. But for the last six years I have been working at the University of Latvia Livonian Institute, which was specially established. One of the key action areas that we work with is building digital resources and looking for approaches for extremely under-resourced, scattered data conditions. Because the Livonian community is actually very small, we have fewer than 20 speakers in general. So we have to find the ways. It is pretty logical that we have also recently been quite active in maybe more global initiatives. I wanted to present those instruments that currently exist supporting developments in the digital area for Indigenous languages. As you all know, this is the International Decade of Indigenous Languages. Just last year there was a specially designated ad hoc group established on digital equality and domains. We were sure that there were also participants. This year there was one very interesting initiative that went out, which is a global survey on Indigenous languages, which is closing in the next couple of months and which provides both data and a perception of what the actual state of Indigenous languages is globally in the digital area. But it also motivates those participating to think about technologies and issues that they have on the path to digital equality. In February in Paris, a conference took place, Language Technology for All 2025. From that conference grew out a new, maybe the freshest, UNESCO initiative, which is the Global Roadmap for Multilingualism in the Digital Era. This is the document that might define the future for languages, and especially for digital languages.
Because it envisions the future where equal opportunities entering digital domains is ensured for all languages. Currently technology caters mostly those top 200 languages of the world. But the majority of the languages are somewhere in second row or in last row, as maybe some. And majority of those languages are Indigenous languages, so this is mechanism that in most work addresses the Indigenous issues. Very shortly, summarizing up the roadmap. Currently there is a roadmap consultation process, so you can easily look up at UNESCO’s webpage and take part of it. But summarizing up, there are three key moments in this roadmap. There are input issues, output issues, and everything regarding process. By input issues, it basically addresses the ability to produce and obtain digital data, which is kind of a precondition for any language to enter. So before we start the technology, we have to start with being capable of producing anything in digital format. Whether it’s sound data for spoken languages or written data, and there are lots of analog challenges before that. And also lots of restrictions. So there are countries that, for example, do not allow digital usage of certain languages or languages that simply don’t have any writing systems or access to technology. So this is one part. The second part is output. That was what Shur was talking about. So imagine if we have ability to produce digital data, we even have technologies like Sami languages, like Livonian, but we are not able to use them. We are not able to get them in. And not only on daily products, but also on cloud computing, on such products like games and educational instruments. So this is another aspect that we need to tackle. And basically what we want to achieve as the end goal is that technology is multilingual by design. So whatever language there is, any technology is adaptable to be used by users of that language. 
And regarding process, there are kind of in the middle of that roadmap sits the idea that communities, language speakers, have to be involved in one or another way in all the stages of technology development. This is not only an issue about how you handle data, but this is also about how technology is developed. This is a question of whether technology is published, if it doesn’t meet, for example, quality standards of the community and many more. Thank you.
Sjur Norstebo Moshagen: Thank you very much, Valts. Then Aili Keskitalo, please.
Aili Keskitalo: Thank you for the floor. I’m here today as a Sámi language user and as a mother raising now young women in a language that has often been pushed aside in public systems, in education, and increasingly in technology. But I’m also here as an advocate for Indigenous peoples’ rights and human rights, believing that technology should serve rights and not markets. For us, it’s not just about innovation, it’s about justice. It’s about the right to exist fully in our own language, not only in traditional settings, but also in e-mails, in voice assistants, in learning apps, and eventually in AI systems. The ability to use your own language, including in digital spaces, is essential for dignity, for cultural continuity, and for meaningful participation in society. Today, over 98% of the world’s languages lack basic digital tools. And this is not a gap, it’s a threat. It means that unless we act, our languages risk going digitally extinct. But we see the potential, we have heard about it today. Sámi institutions, like the Sámi Parliament’s joint project, MIA Techno, are taking steps to develop language technology on our own terms, with open source tools, ethical frameworks, and strong demands for state responsibility. Still, we face barriers, and we have heard about them already today. Closed code from big tech, lack of funding, and not enough access to data to train the systems that we need. At the same time, as Lars Ailo already explained to us, we must be careful. AI is not neutral. It can replicate colonial logics if we are not involved from the beginning, as rights holders, not just users. Language is power, and in this digital age, the right to speak your language must include the right to shape the tools that carry it forward. That is my message. Giitu. Thank you.
Sjur Norstebo Moshagen: Thank you very much, Aili. And the last speaker before we approach the questions, that is Kevin Chan from Meta. Please go ahead.
Kevin Chan: Thank you very much, Sjur. It’s good to see you again. Maybe we just move to the first, the next slide, if you will. If we’re able to. Oh, there we go. So I wanted to start by just sharing that, you know, this is obviously, and it has been referred to by a few other people on the panel, an important decade to be thinking about these very important issues. We are in the decade for indigenous languages. And at Meta, we have been putting together some initiatives, working closely with indigenous peoples and with UNESCO and other language partners to think through how we can help support with, in particular, some of our open source technologies with AI. And the previous panelists talked a bit about open source versus closed source technologies. Open source technologies effectively are ones where we have built some kind of AI model, but then we make it freely available to anybody else who wants to use it. And what that allows you to do is you end up taking the model. You can refine it. You can fine tune it. You can add additional functionality and features to it. And then you own what it is afterwards. And so we do believe, I think, as was previously mentioned, that open sourced AI technologies can be a very valuable technology in this context. So I really want to just leave you with three things to start the conversation. One is an initiative that we helped drive in Canada with Nunavut Tunngavik Incorporated, which is an entity up in Nunavut, which is Canada’s sort of Arctic territory, to translate the platform into Inuktitut. We also launched last year an online language translator for 200 languages with the help of UNESCO and Hugging Face, which is powered by Meta’s open source AI model called No Language Left Behind. And then there’s also a new language technology partnership that we recently announced as well. So if we can move to the next slide, please. 
I just wanted to, again, call out one of the initial projects we did, which was announced, I guess it was maybe two years ago. We launched, again, with NTI’s help. And it really was a long period of collaboration together because we wanted to do this properly. We wanted to do this in a way where we were welcomed by the community to do so. And, of course, the community had the expertise and we obviously didn’t. We wanted to make sure that we did this properly. And it did take about five years. But we were very, very pleased to be able to bring at least the desktop version of Facebook in a, I think, kind of minimal way, because I wouldn’t say that everything was translated. But we had some key parts of the platform translated into Inuktitut. We were very pleased that the governor general of Canada, Mary Simon, who herself is Indigenous and Inuk, shared the good news on Facebook when we launched that day. And very pleased that our friends at UNESCO were very supportive of us championing and helping to drive this kind of initiative during the international decade. The next slide, if I may. This is just a video. I think we can play it. Can you hover around the video or move the cursor? Oh, there we go. It’s a non-audio video. It just kind of shows you what this next initiative is, which is the No Language Left Behind online translator. Again, covering 200 languages in terms of translation. It’s text-to-text. And it is, of course, nowhere near comprehensive of the thousands of Indigenous languages that exist. But among the 200, many of these languages are Indigenous languages. And you can see in the video, you select your origin language. You select the language you intend to translate to. And you just put the text in. And there’s another window below that gives you the translated output. This, again, is something that is open source. 
And so folks are able to, for example, access the model on places like GitHub and then iterate on it. And so there is potential and opportunity to expand the language set to include other Indigenous languages. And you can do this freely. The technology is offered out to the world and to the community freely to do that. And then maybe the last slide, if I can. So this is the Language Technology Partnership, which is something we announced actually in Paris just earlier this spring in February. And it is a project that we’re once again working with partners around the world on. And that is to try to really push the frontiers of language technology, in particular trying to help support low-resource languages. And so what we have been seeking collaboration on, and we could obviously only do this with the agreement of partners, is to have made available different portions of data for languages to help train a model that we hope will be able to be quite powerful in terms of translating and in terms of transcribing different languages that are currently maybe not as supported as we would like. So what we are looking for are partners that can provide 10 hours of speech recordings with transcriptions or some amount of written text, and here we’ve specified sort of 200 plus sentences. And what we hope to do in the coming months with these partnerships is to again build new open source speech technologies, and of course our commitment would be that if we are successful in making some of these breakthroughs in terms of translation and transcription, we would want to then make these technologies freely open to the global language community for them to build applications and further the research. And if there’s interest in this, of course, please do feel free to reach out to me and I will try to do my best to connect you with the right teams that are looking at this. You can also, of course, search for this online. 
I think there is a portal where you can learn more and submit information if there is interest. So thank you very much, and I’ll pause here.
Sjur Norstebo Moshagen: Thank you very much, Kevin. Then all the panelists have introduced their slides, and we can go on to the questions. With over 7,000 languages in the world, it’s clear that no platform can realistically support them all by themselves. Platform owners constrained by security concerns and limited resources often end up centralizing control over language availability, leaving many communities without access to their own languages in the digital space. How can you shift this mindset, and what would it take for them to open up the platform so communities can manage their own languages in the digital space? No questions asked. I was thinking that maybe Lars Ailo could give the first comment on this one.
Lars Ailo Bongo: So maybe I’m approaching this issue a bit differently from the other panelists, because I’m not interested in building the models and thereby don’t have that need to have a platform to run this; I’m more interested in the AI applications that will use the technology that is provided, hopefully, by this kind of platform. So my concern is more with the practical issues of being allowed to do this, and that we need to address the challenge of not just building the models but also the applications, especially the high-risk ones that will be used in education, health services, and other important public services.
Sjur Norstebo Moshagen: Thank you very much. Then Outi, what is your take on this question?
Outi Kaarina Laiti: I have a short answer to this question, and I’m speaking from the perspective of, for example, basic education, where we see language as a human right, of course. So we should shift focus from a feature or localization or a liability towards the view that language is a human right, and it is that on platforms as well. That strengthens the platforms. This is a philosophical question, so this is a philosophical answer, but we need to start seeing the possibilities in this, and not talk about localizations anymore.
Sjur Norstebo Moshagen: Thank you very much. And Kevin, what do you think about what Outi just said?
Kevin Chan: Can you hear me? I just had to unmute it. It sounds like it’s okay. Yes, you can hear me? We hear you. Great. Yeah, I mean, I actually agree very much with what was expressed with the panelist intervention, which is that it may not be necessarily about the models themselves, but more about the application layer. There is, I think, again, going back to what I had mentioned about open source models, and there are many, Meta makes some, but there are other companies, obviously, that make them as well. This is, I think, going to be a very important vector by which Indigenous communities, people who are very committed to supporting, protecting, and promoting low resource languages, this is a very important way, I think, by which you can actually see applications built on top, precisely because the models are free for people to use. And so, with the right amount of training and work to build applications on top of these models, you very much, I think, can get models that are conversant in different languages.
Sjur Norstebo Moshagen: Okay, thank you very much. I think due to time constraints, we go on to the next question. AI technologies still rely heavily on large volumes of text data, even as those requirements gradually decrease as technology develops. How can you ensure that AI is developed for Indigenous and minority language communities in a way that keeps data ownership and control of linguistic data in the hands of those communities? And how can we ensure that AI-generated content is of such quality that it supports rather than harms the language and its speakers? Valts, what do you think about this?
Valts Ernstreits: Yeah, I would probably take this question the same way I would approach the previous one. So, this is basically a mind shift, because, well, there is this thing. In order to use a language in whatever technology, we need large amounts of data. And that data, especially for small communities, is always hypersensitive. So, you don’t even need sensitive data to actually collide with GDPR, which is already there. But what is actually needed is what I mentioned previously, community involvement in all stages of technology, because we need community contribution in order to get technology running. But at the same time, we need to make sure that the technology that is produced is not harmful, that it is ready, that it corresponds to what the community needs. And so, there is no other way around. And this is not done by legislation that much. This is really a mind shift, because we run, for example, into developers, even academia, who should be very well aware of the issues. We run into situations where we have to explain to them why this is not working, why this is not okay. And we do need a mind shift toward listening to indigenous people in all stages of the process.
Sjur Norstebo Moshagen: Thank you very much. Aili, would you like to say something?
Aili Keskitalo: Yes. I would start by agreeing with Valts on the demand for a shift of mindset. And I think the shift will need to be from thinking about getting permission to entering into true partnerships with the indigenous peoples, with the language communities, so that the language users are not just passive users, but co-creators. And that would maybe build the trust that is needed for data collection, because it is, of course, about data sovereignty as well. And the principle often used in other contexts when it comes to indigenous peoples’ rights is the principle of free, prior and informed consent. And that should be used also when it comes to data collection and the application of that data. Thanks.
Sjur Norstebo Moshagen: Thank you very much. Time is running way too fast for us. So, I think we should see… How is it, David? Do we have any questions from the online audience?
David Castillo Barra: No, we don’t have any online questions. So, I think you were very, very clear. Thank you.
Sjur Norstebo Moshagen: Do we have any questions from the audience in the room? Yes, we have one. Please.
Audience: Dear distinguished speakers, I’m so excited because this panel respects the indigenous languages and cultures so much, and also we have Meta working with UNESCO to protect the languages and the cultures. So I think there’s a strong conflict between data ownership and the way the data is collected. If we want to help the large language models work well for the indigenous languages, we have to turn over the data and then fine-tune the large language model. So the contributor becomes the user, unfortunately. That is because of the traditional architecture of the Internet. So Meta has to be a centralized platform. So there’s no way to solve this, but there’s a new paradigm shift. A new protocol was invented fully, and this year it will be triggered. So with data ownership, it complies with GDPR, and also this data can be collected in a way that is anti-digital colonization. So I’m Henry Wang from Singapore IGF. I’m the founding member of the Singapore Internet Governance Forum, and also I’m the co-founder of LingoAI. LingoAI works with the founding father of the World Wide Web, Sir Tim Berners-Lee. He invented HTTP, and our Internet became centralized. Then he felt sorry about this, and he invented a new protocol to correct the Internet. The new protocol is called Solid, and LingoAI works with Solid and MetaLife, and we can collect the data with data ownership held by all the contributors and the users. And the datasets can also be used by authorization, by permission, by the large language model companies worldwide or locally. Local data is important because on-device models can work with local datasets and become everyone’s AI agent. It’s your personal agent. So we are working on this solution, and it’s also already available. For example, Sámi contributors can all contribute data, but they control the part that they own themselves. 
They can authorize Meta, authorize OpenAI, authorize Norwegian large language model companies, but they keep the ownership and fully comply with GDPR. So I’m so happy to join this, you know, as an attendee of this panel. So my question will be: if we have such solutions, if the Internet has such protocols, are you willing to try to work in this way to help protect the indigenous languages and the cultures based on the languages? Thank you.
Sjur Norstebo Moshagen: Thank you very much. Just a few seconds. Okay. The ADG from UNESCO would like to make some closing remarks. We might have time for one short question after that. We’ll take the closing remarks now. So please go ahead, Tawfik Jelassi.
Tawfik Jelassi: Good afternoon to all of you, Excellencies, distinguished panelists, esteemed participants. As we come to the close of this important session, I would like first to express my sincere gratitude to all the speakers and participants for their substantial and insightful inputs which we had this afternoon. And I would like also to extend special thanks to Mrs. Stenseng, the Minister of Local Government and Regional Development of Norway, who has always shown us commitment, engagement and support, especially in the context of the international decade of indigenous languages. The commitment and leadership of Norway has been instrumental in advancing our shared goal to safeguard and revitalize indigenous languages in the digital age. Also I would like to express my deep gratitude to the members of the international decade of indigenous languages, members of the ADG for their invaluable contributions, especially to the global survey which took place on indigenous languages. I’m also excited to see the survey’s findings and to explore how we can further collaborate with indigenous communities and nations worldwide to advance this vital work. The title of this afternoon’s session, It’s Not Just the Tech, reminds us of a fundamental truth. Technology alone cannot solve the challenges that we face. And yes, indigenous language technologies exist and AI holds transformative potential. However, if the systems are not inclusive, do not respect cultural and linguistic rights, then the technology by itself is just another barrier instead of fully playing its role as a bridge between communities and cultures. I think we heard this clearly today. The barriers to meaningful uptake of indigenous language technologies are not technical. They are structural, they are political and they are ethical. 
From the spread of proprietary platforms to restrictive data protection regimes to the persistent exclusion of indigenous peoples from digital policymaking, these are the conditions that determine whether indigenous languages can truly thrive in cyberspace. At UNESCO, we stand with indigenous peoples to affirm the right to fully participate, to also have equal footing in digital space in their own languages. Indigenous communities must not only benefit from these technologies, they must be central to its design, its development and its governance. Their knowledge systems, their worldviews and linguistic heritage are not just valuable, they are essential to shape an ethical and inclusive digital future. This is the vision behind the international decade of indigenous languages, not just preservation, but true empowerment. We are proud to support projects like the Mayan Language Preservation and Digitalization Project in partnership with Masterwords. This project has created new talking glossaries, localized websites and a universal Mayan keyboard, now empowering millions of speakers of this language across the Americas. Still many challenges remain. AI systems continue to reflect linguistic hierarchies, data remains scarce or inaccessible and indigenous women and girls face barriers in accessing and shaping these technologies. We must address these gaps by investing in open, community-driven innovation and in promoting gender-responsive digital inclusion. As a next step, UNESCO invites you all to contribute to the Roadmap for Language Technologies, a roadmap which is now online for public consultation. Your contribution will help us shape this global process. In closing, let me share the wise words of Nelson Mandela, who said, quote, If you talk to a man in a language he understands, that goes to his head. If you talk to him in his language, that goes to his heart, end of quote. 
Let’s work together to build a digital future that speaks not only to minds but to hearts through linguistic justice, cultural dignity and inclusive technology. Thank you.
Sjur Norstebo Moshagen: Thank you very much. And that’s the end of the panel discussion. Time is out. Thank you all participants and the audience. Thank you very much. Thank you.
MODERATOR:
Ole Henrik Bjorkmo Lifjell
Speech speed
119 words per minute
Speech length
355 words
Speech time
178 seconds
Indigenous communities face limited digital infrastructure and tools, with large tech companies not seeing indigenous languages as profitable markets
Explanation
Indigenous and minority communities encounter barriers due to insufficient digital infrastructure and tools supporting their languages. Large technology companies do not view indigenous languages as profitable markets, and most online content is dominated by a handful of global languages.
Evidence
Most online content is dominated by a handful of global languages, and many indigenous communities have oral traditions as cultural preservation with lack of written language making digitization complex
Major discussion point
Barriers to Indigenous Language Technology Access
Topics
Development | Sociocultural
Indigenous communities must have control and management of linguistic data collection that benefits their own communities, following human rights principles
Explanation
Policy development should use human rights principles and be supported by national laws that regulate and secure indigenous communities’ control over linguistic data collection. This ensures that data collection benefits the communities themselves rather than external entities.
Evidence
Need for national laws which will regulate and secure that indigenous communities have control and management of linguistic data collection, and AI-generated data innovations need to be used in a non-discriminative way
Major discussion point
Data Sovereignty and Community Control
Topics
Human rights | Legal and regulatory
Agreed with
– Aili Keskitalo
– Valts Ernstreits
Agreed on
Indigenous communities must have control and ownership over their linguistic data
National laws must regulate and secure indigenous community control over linguistic data collection
Explanation
There is a need for legal frameworks at the national level that will regulate and ensure indigenous communities maintain control and management over the collection of their linguistic data. This legal protection is essential to prevent exploitation and ensure community benefit.
Evidence
Policy that use human rights principles and take accountability by use of national laws which will regulate and secure that indigenous communities have control and management of linguistic data collection
Major discussion point
Regulatory and Legal Framework Challenges
Topics
Legal and regulatory | Human rights
Sjur Norstebo Moshagen
Speech speed
124 words per minute
Speech length
1401 words
Speech time
674 seconds
Platform owners make life difficult for most world languages through closed systems, often without realizing it due to ignorance or negligence
Explanation
Platform owners create barriers for the majority of the world’s languages through their closed systems and restrictive policies. This is typically not done with malicious intent but rather stems from ignorance or negligence about the needs of minority language communities.
Evidence
Tools by Apple and Microsoft are treated very differently from tools by everyone else, with serious problems including complete blocking, lack of independent localization possibilities, and no platforms for providing translations without permission
Major discussion point
Barriers to Indigenous Language Technology Access
Topics
Infrastructure | Legal and regulatory
Agreed with
– Aili Keskitalo
– Valts Ernstreits
Agreed on
Current technology systems create barriers for indigenous languages
Language technology for indigenous languages is often technically possible but cannot be delivered in the apps and systems where users want to use them
Explanation
The technological capability exists to create language tools for indigenous languages, as demonstrated by 20 years of work in this field. However, the main challenge is not the technology itself but the inability to integrate these tools into the platforms and applications where users actually want to use them.
Evidence
Examples include spellers and proofing tools in online office applications where there is no possibility to install these tools so they behave as people expect them to
Major discussion point
Practical Implementation and Solutions
Topics
Infrastructure | Development
The goal is empowering every individual to use and preserve their language in digital spaces as an integral part of human rights
Explanation
The UNESCO global roadmap for multilingualism in the digital era aims to create a strategic framework that recognizes language rights as integral to human rights. The objective is to empower every individual to use and preserve their language in digital spaces.
Evidence
UNESCO’s draft global roadmap states that ‘language rights are integral to human rights, the roadmap aims to empower every individual to use and preserve the language in digital spaces’
Major discussion point
Vision for Digital Language Equality
Topics
Human rights | Sociocultural
Aili Keskitalo
Speech speed
93 words per minute
Speech length
438 words
Speech time
281 seconds
Over 98% of the world’s languages lack basic digital tools, creating a threat of digital extinction rather than just a gap
Explanation
The vast majority of the world’s languages do not have access to basic digital tools and technologies. This represents more than just a technological gap – it constitutes an existential threat where languages risk becoming digitally extinct if action is not taken.
Major discussion point
Barriers to Indigenous Language Technology Access
Topics
Development | Sociocultural
Agreed with
– Sjur Norstebo Moshagen
– Valts Ernstreits
Agreed on
Current technology systems create barriers for indigenous languages
AI is not neutral and can replicate colonial logics if indigenous peoples are not involved from the beginning as rights holders, not just users
Explanation
Artificial intelligence systems are not neutral technologies and have the potential to perpetuate colonial patterns of oppression if indigenous peoples are not meaningfully involved in their development. Indigenous peoples must be recognized as rights holders with decision-making power, not merely end users of the technology.
Evidence
Language is power, and in this digital age, the right to speak your language must include the right to shape the tools that carry it forward
Major discussion point
AI Development and Indigenous Language Inclusion
Topics
Human rights | Sociocultural
The shift must be from seeking permission to entering true partnerships with indigenous peoples as co-creators, applying free, prior and informed consent principles
Explanation
Rather than simply asking for permission to use indigenous language data, there needs to be a fundamental change toward establishing genuine partnerships where indigenous peoples are co-creators of technology. This approach should be based on the principle of free, prior and informed consent that is commonly used in indigenous rights contexts.
Evidence
The principle of free, prior and informed consent should be used when it comes to data collection and the application of that data
Major discussion point
Data Sovereignty and Community Control
Topics
Human rights | Legal and regulatory
Agreed with
– Ole Henrik Bjorkmo Lifjell
– Valts Ernstreits
Agreed on
Indigenous communities must have control and ownership over their linguistic data
Disagreed with
– Lars Ailo Bongo
Disagreed on
Data collection approach and regulatory challenges
Lars Ailo Bongo
Speech speed
147 words per minute
Speech length
785 words
Speech time
319 seconds
AI has great potential to bridge equity gaps for indigenous people in fields like medicine and education where cultural and linguistic expertise is lacking
Explanation
Artificial intelligence could help address significant equity gaps that indigenous people face, particularly in areas like healthcare and education where there are very few experts with the necessary language and cultural knowledge. AI could provide services where currently nothing exists, such as culturally appropriate psychological tests or adaptive learning systems.
Evidence
In psychology, there are very few tests that are normed on indigenous minority languages, so the tests basically don’t work well for indigenous people. In education, AI has great potential to provide adaptive learning which is important for minority language speakers
Major discussion point
AI Development and Indigenous Language Inclusion
Topics
Development | Human rights
Disagreed with
– Kevin Chan
Disagreed on
Approach to AI development for indigenous languages
Indigenous people face a dilemma as data subjects requiring extra protection under GDPR, yet needing data collection to ensure AI works equitably for minorities
Explanation
There is a fundamental tension between data protection laws that classify indigenous people as a special category requiring extra strong protection, and the need to collect data from these communities to ensure AI systems work fairly for them. This creates a challenging situation where the very protections meant to help may hinder equitable AI development.
Evidence
The GDPR says it’s not allowed to collect ethnic data unless you have a really good purpose, but the EU AI Act says it is allowed to collect ethnic data if the purpose is to prove that AI works as well for indigenous people as for other minorities
Major discussion point
Data Sovereignty and Community Control
Topics
Legal and regulatory | Human rights
Disagreed with
– Aili Keskitalo
Disagreed on
Data collection approach and regulatory challenges
EU AI Act prohibits discrimination against minorities but creates challenges by classifying indigenous people as special category data subjects requiring stronger protection
Explanation
While the EU AI Act mandates that educational and health services must work equally well for minority people as for others, it simultaneously creates obstacles by treating indigenous peoples as a special data category. This classification requires extra strong data protection measures that can complicate the development of equitable AI systems.
Evidence
EU AI Act says it’s not allowed to discriminate against minorities such as indigenous people, but indigenous people are considered data subjects in a special category requiring extra strong data protection
Major discussion point
Regulatory and Legal Framework Challenges
Topics
Legal and regulatory | Human rights
Regulatory sandboxes are needed to ensure ethical and safe data collection from indigenous communities for AI development
Explanation
To overcome the challenges of collecting sensitive data from indigenous communities while maintaining ethical standards, regulatory sandboxes should be established. These frameworks would allow for controlled, ethical data collection that avoids the historical problems of racist research while enabling the development of equitable AI systems.
Evidence
Being an indigenous person myself, I know that we have historically been exposed to basically racist research where they attempted to show that indigenous people are less intelligent, but we can do this in a much more ethical way than was done in the Dark Ages
Major discussion point
Regulatory and Legal Framework Challenges
Topics
Legal and regulatory | Human rights
Valts Ernstreits
Speech speed
121 words per minute
Speech length
926 words
Speech time
459 seconds
Technology currently caters mostly to the top 200 languages globally, leaving the majority of languages, especially indigenous ones, in secondary positions
Explanation
Current technology development focuses primarily on approximately 200 languages worldwide, while the vast majority of languages, particularly indigenous languages, receive little to no technological support. This creates a hierarchy where most of the world’s linguistic diversity is relegated to secondary status in the digital realm.
Evidence
The Livonian community has fewer than 20 speakers in general, representing extremely under-resourced, scattered data conditions that require finding new approaches
Major discussion point
Barriers to Indigenous Language Technology Access
Topics
Development | Sociocultural
Agreed with
– Sjur Norstebo Moshagen
– Aili Keskitalo
Agreed on
Current technology systems create barriers for indigenous languages
UNESCO’s Global Roadmap for Multilingualism aims to ensure all language communities thrive in the digital age with technology that is multilingual by design
Explanation
The UNESCO Global Roadmap for Multilingualism in the Digital Era provides a framework for advancing language technologies and promoting linguistic diversity. The ultimate goal is to create technology that is inherently multilingual by design, meaning any technology should be adaptable for users of any language.
Evidence
The roadmap addresses input issues (ability to produce digital data), output issues (ability to use technologies), and process issues (community involvement in all stages of technology development)
Major discussion point
Vision for Digital Language Equality
Topics
Development | Sociocultural
Community involvement is required at all stages of technology development, not just in data handling but also in determining quality standards and publication decisions
Explanation
Indigenous and minority language communities must be involved throughout the entire technology development process, from initial design to final implementation. This involvement extends beyond just providing data to include setting quality standards and making decisions about whether and how technology should be published or distributed.
Evidence
This is not only an issue about how you handle data, but this is also about how technology is developed, including whether technology is published if it doesn’t meet quality standards of the community
Major discussion point
Data Sovereignty and Community Control
Topics
Human rights | Development
Agreed with
– Ole Henrik Bjorkmo Lifjell
– Aili Keskitalo
Agreed on
Indigenous communities must have control and ownership over their linguistic data
Outi Kaarina Laiti
Speech speed
121 words per minute
Speech length
704 words
Speech time
347 seconds
Extended reality and gaming applications face major issues in language education due to lack of discussion tools and ethical representation questions
Explanation
Virtual and augmented reality technologies cannot be effectively used in language education because they lack proper tools for conducting discussions in indigenous languages. Additionally, there are significant ethical concerns about how to represent Sámi characters and culture in games, including questions about what they should say and how they should speak.
Evidence
You cannot actually use extended reality in language education because we don’t have the tools to have discussions in virtual reality, and ethical questions arise around representation of Sámi characters in games
Major discussion point
AI Development and Indigenous Language Inclusion
Topics
Sociocultural | Development
Finland’s introduction of programming in basic education has created progress with Sámi programming guides and media archives for speech recognition training
Explanation
Finland’s decision to include programming in basic education starting from grade one has led to positive developments for Sámi language technology. This includes the publication of programming guides in three Sámi languages and the development of media archives that can be used for training speech recognition tools.
Evidence
National Audiovisual Institute has published guides for media education and programming in three Sámi languages, and there are huge Sámi media archives that have been used to train automatic speech recognition tools
Major discussion point
Practical Implementation and Solutions
Topics
Development | Sociocultural
The focus should shift from viewing language as a feature or localization liability to recognizing it as a human right on platforms
Explanation
There needs to be a fundamental philosophical shift in how platforms and technology companies view language support. Instead of treating indigenous languages as optional features or costly localization burdens, they should be recognized as fundamental human rights that must be supported on digital platforms.
Major discussion point
Regulatory and Legal Framework Challenges
Topics
Human rights | Sociocultural
Agreed with
– Valts Ernstreits
– Aili Keskitalo
Agreed on
Need for fundamental mindset shift in technology development approach
Kevin Chan
Speech speed
137 words per minute
Speech length
1248 words
Speech time
545 seconds
Open source AI technologies can be valuable for indigenous communities as they allow refinement, fine-tuning, and community ownership of adapted models
Explanation
Open source AI models provide significant advantages for indigenous language communities because they can be freely accessed, modified, and customized to meet specific community needs. Unlike closed systems, open source technologies allow communities to take ownership of the adapted models and continue developing them independently.
Evidence
Meta’s No Language Left Behind translator covers 200 languages including many indigenous languages, and the Language Technology Partnership seeks 10 hours of speech recordings with transcriptions to build new open source speech technologies
Major discussion point
AI Development and Indigenous Language Inclusion
Topics
Development | Infrastructure
Disagreed with
– Lars Ailo Bongo
Disagreed on
Approach to AI development for indigenous languages
Meta has developed initiatives including Facebook translation to Inuktitut, No Language Left Behind translator for 200 languages, and Language Technology Partnership seeking community collaboration
Explanation
Meta has launched several specific initiatives to support indigenous languages, including translating Facebook into Inuktitut in collaboration with Nunavut Tunngavik Incorporated, creating a 200-language translator, and establishing a partnership program that seeks community collaboration to develop new language technologies. These efforts represent concrete steps toward including indigenous languages in major technology platforms.
Evidence
The Inuktitut Facebook translation took five years of collaboration with the community, the No Language Left Behind translator is freely available on platforms like GitHub, and the Language Technology Partnership seeks partners who can provide speech recordings and text data
Major discussion point
Practical Implementation and Solutions
Topics
Development | Infrastructure
Audience
Speech speed
112 words per minute
Speech length
408 words
Speech time
217 seconds
New protocols like SOLID could enable data ownership by contributors while allowing authorized use by language model companies
Explanation
A new internet protocol called SOLID, invented by the founder of the World Wide Web, could solve data ownership issues by allowing indigenous language contributors to maintain ownership of their data while authorizing its use by AI companies. This approach could prevent digital colonization while enabling language model development.
Evidence
SOLID protocol works with LingoAI to collect data where contributors control ownership and can authorize use by Meta, OpenAI, or other companies while maintaining full GDPR compliance and ownership rights
Major discussion point
Practical Implementation and Solutions
Topics
Infrastructure | Legal and regulatory
Tawfik Jelassi
Speech speed
108 words per minute
Speech length
606 words
Speech time
336 seconds
Indigenous communities must be central to technology design, development and governance, with their knowledge systems essential for ethical digital futures
Explanation
Indigenous peoples should not merely benefit from digital technologies but must be at the center of how these technologies are designed, developed, and governed. Their knowledge systems, worldviews, and linguistic heritage are not just valuable additions but are essential components for creating ethical and inclusive digital futures.
Evidence
UNESCO supports projects like the Mayan Language Preservation and Digitalization Project which created talking glossaries, localized websites and a universal Mayan keyboard empowering millions of speakers
Major discussion point
Vision for Digital Language Equality
Topics
Human rights | Development
Building a digital future requires linguistic justice, cultural dignity and inclusive technology that speaks to hearts through indigenous languages
Explanation
Creating an equitable digital future necessitates more than just technical solutions – it requires linguistic justice, respect for cultural dignity, and truly inclusive technology development. Drawing on Nelson Mandela’s quote about speaking to people in their own language, the goal is to build technology that connects with people’s hearts and cultural identity, not just their minds.
Evidence
Nelson Mandela’s quote: ‘If you talk to a man in a language he understands, that goes to his head. If you talk to him in his language, that goes to his heart’
Major discussion point
Vision for Digital Language Equality
Topics
Human rights | Sociocultural
David Castillo Barra
Speech speed
163 words per minute
Speech length
118 words
Speech time
43 seconds
UNESCO’s International Decade of Indigenous Languages Secretariat supports multilingualism in cyberspace with focus on fostering linguistic diversity in digital spaces
Explanation
David Castillo Barra represents UNESCO’s Secretariat for the International Decade of Indigenous Languages and works as an international consultant specializing in multilingualism promotion. His role involves supporting initiatives related to UNESCO’s recommendation on multilingualism in cyberspace with a strong emphasis on fostering linguistic diversity in digital environments.
Evidence
He serves as a member of the Secretariat for the International Decade of Indigenous Languages at UNESCO and supports initiatives related to UNESCO’s recommendation on multilingualism in cyberspace
Major discussion point
Vision for Digital Language Equality
Topics
Human rights | Sociocultural
MODERATOR
Speech speed
5 words per minute
Speech length
3 words
Speech time
31 seconds
The session opens and closes the panel discussion on barriers to indigenous language technology and AI uptake
Explanation
The moderator provides structural support for the panel discussion by opening and closing the session with musical transitions. This represents the formal framework within which the substantive discussions about indigenous language technology barriers take place.
Evidence
Musical transitions at the beginning and end of the session
Major discussion point
Panel Structure and Format
Topics
Sociocultural
Agreements
Agreement points
Need for fundamental mindset shift in technology development approach
Speakers
– Valts Ernstreits
– Outi Kaarina Laiti
– Aili Keskitalo
Arguments
Community involvement is required at all stages of technology development, not just in data handling but also in determining quality standards and publication decisions
The focus should shift from viewing language as a feature or localization liability to recognizing it as a human right on platforms
The shift must be from seeking permission to entering true partnerships with indigenous peoples as co-creators, applying free, prior and informed consent principles
Summary
All three speakers emphasize the need for a fundamental change in how technology companies and developers approach indigenous languages – moving from treating them as optional features to recognizing them as human rights, and shifting from seeking permission to establishing true partnerships with indigenous communities as co-creators.
Topics
Human rights | Development | Sociocultural
Indigenous communities must have control and ownership over their linguistic data
Speakers
– Ole Henrik Bjorkmo Lifjell
– Aili Keskitalo
– Valts Ernstreits
Arguments
Indigenous communities must have control and management of linguistic data collection that benefits their own communities, following human rights principles
The shift must be from seeking permission to entering true partnerships with indigenous peoples as co-creators, applying free, prior and informed consent principles
Community involvement is required at all stages of technology development, not just in data handling but also in determining quality standards and publication decisions
Summary
There is strong consensus that indigenous communities must maintain control over their linguistic data, with decisions about collection, use, and application being made by the communities themselves rather than external entities.
Topics
Human rights | Legal and regulatory | Data governance
Current technology systems create barriers for indigenous languages
Speakers
– Sjur Norstebo Moshagen
– Aili Keskitalo
– Valts Ernstreits
Arguments
Platform owners make life difficult for most world languages through closed systems, often without realizing it due to ignorance or negligence
Over 98% of the world’s languages lack basic digital tools, creating a threat of digital extinction rather than just a gap
Technology currently caters mostly to the top 200 languages globally, leaving the majority of languages, especially indigenous ones, in secondary positions
Summary
All speakers agree that current technology infrastructure systematically excludes indigenous languages, with the vast majority of languages lacking basic digital support and facing potential digital extinction.
Topics
Infrastructure | Development | Sociocultural
Similar viewpoints
Both speakers see significant potential in AI technologies to benefit indigenous communities, with Lars focusing on bridging equity gaps in essential services and Kevin emphasizing the value of open source approaches that allow community control and customization.
Speakers
– Lars Ailo Bongo
– Kevin Chan
Arguments
AI has great potential to bridge equity gaps for indigenous people in fields like medicine and education where cultural and linguistic expertise is lacking
Open source AI technologies can be valuable for indigenous communities as they allow refinement, fine-tuning, and community ownership of adapted models
Topics
Development | Infrastructure | Human rights
Both speakers acknowledge that the technical capability exists to support indigenous languages, but the challenge lies in implementation and delivery through accessible platforms and systems.
Speakers
– Sjur Norstebo Moshagen
– Kevin Chan
Arguments
Language technology for indigenous languages is often technically possible but cannot be delivered in the apps and systems where users want to use them
Open source AI technologies can be valuable for indigenous communities as they allow refinement, fine-tuning, and community ownership of adapted models
Topics
Infrastructure | Development
Both speakers emphasize that indigenous peoples must be central to technology development as rights holders and decision-makers, not merely users, to prevent the replication of colonial patterns and ensure ethical development.
Speakers
– Tawfik Jelassi
– Aili Keskitalo
Arguments
Indigenous communities must be central to technology design, development and governance, with their knowledge systems essential for ethical digital futures
AI is not neutral and can replicate colonial logics if indigenous peoples are not involved from the beginning as rights holders, not just users
Topics
Human rights | Development | Sociocultural
Unexpected consensus
Recognition of regulatory complexity and need for balanced approaches
Speakers
– Lars Ailo Bongo
– Kevin Chan
Arguments
Indigenous people face a dilemma as data subjects requiring extra protection under GDPR, yet needing data collection to ensure AI works equitably for minorities
Open source AI technologies can be valuable for indigenous communities as they allow refinement, fine-tuning, and community ownership of adapted models
Explanation
It’s unexpected to see both an academic researcher and a major tech company representative acknowledge the complexity of data protection regulations and the need for nuanced solutions that balance protection with enabling equitable AI development.
Topics
Legal and regulatory | Human rights | Development
Acknowledgment of current system failures by platform representatives
Speakers
– Sjur Norstebo Moshagen
– Kevin Chan
Arguments
Platform owners make life difficult for most world languages through closed systems, often without realizing it due to ignorance or negligence
Open source AI technologies can be valuable for indigenous communities as they allow refinement, fine-tuning, and community ownership of adapted models
Explanation
It’s notable that a representative from a major tech platform (Meta) implicitly acknowledges the limitations of current closed systems by advocating for open source solutions, aligning with criticism of platform barriers.
Topics
Infrastructure | Development
Overall assessment
Summary
There is remarkably strong consensus among speakers on key issues: the need for indigenous communities to control their linguistic data, the requirement for fundamental mindset shifts in technology development, the current systemic barriers facing indigenous languages, and the importance of treating language rights as human rights rather than optional features.
Consensus level
High level of consensus with significant implications – this unified voice from diverse stakeholders (indigenous rights advocates, academics, tech company representatives, and international organizations) creates a strong foundation for policy development and concrete action. The agreement spans technical, legal, ethical, and cultural dimensions, suggesting that solutions must be similarly comprehensive and that there is sufficient common ground to move forward with collaborative initiatives.
Differences
Different viewpoints
Approach to AI development for indigenous languages
Speakers
– Lars Ailo Bongo
– Kevin Chan
Arguments
AI has great potential to bridge equity gaps for indigenous people in fields like medicine and education where cultural and linguistic expertise is lacking
Open source AI technologies can be valuable for indigenous communities as they allow refinement, fine-tuning, and community ownership of adapted models
Summary
Lars Ailo focuses on building AI applications for high-risk areas like education and health services, while Kevin Chan emphasizes providing open source models that communities can adapt themselves. Lars Ailo is more concerned with practical applications, while Kevin Chan focuses on the foundational technology layer.
Topics
Development | Human rights
Data collection approach and regulatory challenges
Speakers
– Lars Ailo Bongo
– Aili Keskitalo
Arguments
Indigenous people face a dilemma as data subjects requiring extra protection under GDPR, yet needing data collection to ensure AI works equitably for minorities
The shift must be from seeking permission to entering true partnerships with indigenous peoples as co-creators, applying free, prior and informed consent principles
Summary
Lars Ailo emphasizes the technical and legal challenges of data collection under GDPR while advocating for regulatory sandboxes, whereas Aili Keskitalo focuses on fundamental partnership approaches and indigenous rights principles. They differ on whether the primary solution is regulatory reform or relationship restructuring.
Topics
Legal and regulatory | Human rights
Unexpected differences
Role of regulatory frameworks versus community partnerships
Speakers
– Lars Ailo Bongo
– Aili Keskitalo
Arguments
Regulatory sandboxes are needed to ensure ethical and safe data collection from indigenous communities for AI development
AI is not neutral and can replicate colonial logics if indigenous peoples are not involved from the beginning as rights holders, not just users
Explanation
This disagreement is unexpected because both speakers are indigenous advocates, yet they approach the solution differently. Lars Ailo, despite acknowledging historical racist research, still advocates for regulatory frameworks to enable data collection, while Aili Keskitalo emphasizes that AI can replicate colonial patterns and focuses on rights-based approaches. This reveals a tension within indigenous advocacy between pragmatic regulatory solutions and principled rights-based approaches.
Topics
Legal and regulatory | Human rights
Overall assessment
Summary
The main areas of disagreement center around approaches to AI development (application-focused vs. foundational technology), data collection methods (regulatory solutions vs. partnership principles), and the balance between technical pragmatism and rights-based approaches.
Disagreement level
The level of disagreement is moderate but significant. While all speakers share the common goal of advancing indigenous language technology, they differ substantially on implementation strategies. This disagreement reflects deeper tensions between technical feasibility, legal compliance, and indigenous rights principles. The implications are significant as these different approaches could lead to very different outcomes for indigenous communities – from regulatory sandboxes that enable data collection to partnership models that prioritize community control, to open source solutions that emphasize technical accessibility.
Takeaways
Key takeaways
Indigenous language technology barriers are primarily structural, political, and ethical rather than technical – the technology exists but cannot be delivered effectively due to platform restrictions and closed systems
Over 98% of the world’s languages lack basic digital tools, creating a threat of digital extinction, with technology currently serving only the top 200 languages globally
AI has transformative potential to bridge equity gaps in medicine, education, and other services for indigenous communities, but risks widening gaps if indigenous peoples are excluded from AI development
Open source AI technologies offer more promise than closed systems for indigenous language development as they allow community ownership, refinement, and adaptation
Data sovereignty is crucial – indigenous communities must control their linguistic data and be involved as co-creators and rights holders, not just users, throughout all stages of technology development
The mindset must shift from seeking permission to entering true partnerships with indigenous peoples, applying free, prior and informed consent principles
Language should be viewed as a human right on digital platforms rather than as a localization feature or liability
Regulatory frameworks like EU AI Act and GDPR create both protections and challenges for indigenous language AI development, requiring innovative approaches like regulatory sandboxes
Resolutions and action items
UNESCO invites all participants to contribute to the Global Roadmap for Language Technologies, which is available online for public consultation
Meta’s Language Technology Partnership is seeking collaborators who can provide 10 hours of speech recordings with transcriptions or 200+ sentences of written text to build new open source speech technologies
Participants encouraged to establish connections for further exchange on indigenous language technology issues within the Internet Governance Forum framework
Need to develop regulatory sandboxes to ensure ethical and safe data collection from indigenous communities for AI development
Requirement to follow up with policy development using human rights principles and national laws to regulate indigenous community control over linguistic data
Unresolved issues
How to practically implement community control over data while meeting technical requirements for AI training that typically require large datasets
How to balance GDPR data protection requirements for indigenous peoples as ‘special category’ subjects with the need for data collection to ensure equitable AI performance
How to shift platform owners’ mindset from centralized control to allowing communities to manage their own languages without security concerns
How to ensure AI-generated content quality supports rather than harms indigenous languages and their speakers
How to address the fundamental conflict between data ownership principles and the centralized architecture of current internet platforms
How to scale solutions beyond pilot projects to achieve meaningful global impact for thousands of indigenous languages
How to ensure indigenous women and girls have equal access to and influence over language technology development
Suggested compromises
Use of open source AI models as a middle ground that allows community adaptation while leveraging existing technological infrastructure
Development of regulatory sandboxes that balance ethical data collection needs with legal protection requirements
Adoption of new protocols like SOLID that could enable data ownership by contributors while allowing authorized use by language model companies
Focus on application layer development rather than building models from scratch, utilizing existing open source foundations
Partnership approaches where tech companies work with indigenous communities over extended periods (like Meta’s 5-year collaboration for Inuktitut translation) to ensure proper community involvement and consent
Thought provoking comments
AI has a great potential to bridge maybe the most important equity gap that sort of indigenous people are exposed to, which is the lack of experts in fields like medicine or education that has the language and cultural knowledge needed to sort of understand and provide equitable services… However, there is one big challenge, which is that the indigenous people and other minorities are considered data subjects, considered a special category. So this requires extra strong data protection.
Speaker
Lars Ailo Bongo
Reason
This comment is deeply insightful because it identifies a fundamental paradox in AI development for indigenous communities: the very legal protections designed to safeguard indigenous peoples (like GDPR’s special category protections) can inadvertently create barriers to developing AI tools that could address historical inequities in healthcare and education. Bongo illustrates this with the controversial but necessary example of needing to collect IQ test data from indigenous children to create equitable adaptive learning systems.
Impact
This comment shifted the discussion from purely technical barriers to the complex ethical and legal landscape surrounding indigenous data. It introduced the concept of ‘regulatory sandboxes’ as a potential solution and highlighted how well-intentioned data protection laws can create unintended consequences for the communities they aim to protect.
Platform owners make life hard for most of the world’s languages, but probably mostly without realizing it. I don’t think there’s bad intent behind it. It’s just ignorance or negligence… What we cannot always do is deliver the tools in the apps and the systems and the context where users want to use them. That’s the major problem.
Speaker
Sjur Norstebo Moshagen
Reason
This observation reframes the entire discussion by distinguishing between technical capability and systemic accessibility. Moshagen’s insight that the technology exists but delivery mechanisms are blocked challenges the common assumption that the primary barrier is technological development. His characterization of the problem as ‘ignorance or negligence’ rather than malice suggests different solution pathways.
Impact
This comment established the foundational premise for the entire panel discussion, setting up the central tension between having working technology and being unable to deploy it effectively. It influenced subsequent speakers to focus on structural and policy barriers rather than purely technical challenges.
For us, it’s not just about innovation, it’s about justice. It’s about the right to exist fully in our own language, not only in traditional settings, but also in e-mails, in voice assistants, in learning apps, and eventually in AI systems… Language is power, and in this digital age, the right to speak your language must include the right to shape the tools that carry it forward.
Speaker
Aili Keskitalo
Reason
Keskitalo’s framing elevates the discussion from a technical problem to a fundamental human rights issue. Her phrase ‘the right to exist fully in our own language’ powerfully articulates what’s at stake beyond mere technological access. The connection between language rights and the right to shape technological tools introduces the concept of technological sovereignty.
Impact
This comment shifted the conversation’s moral framework, moving from discussing indigenous communities as beneficiaries of technology to positioning them as rights-holders who should control their technological destiny. It reinforced the theme that emerged throughout the discussion about community involvement in all stages of technology development.
We need to shift focus from a feature or localization or a liability towards that language is a human right, and it is that on platforms as well… we need to kind of start seeing the possibilities in this, and instead of and not talk about localizations anymore.
Speaker
Outi Kaarina Laiti
Reason
This comment challenges the fundamental business model and conceptual framework of how tech companies approach language support. By rejecting the ‘localization’ paradigm entirely, Laiti suggests that treating languages as optional features or market considerations is inherently problematic. The call to see language support as a human right rather than a business decision represents a radical reframing.
Impact
This intervention prompted Kevin Chan from Meta to explicitly agree and discuss how open-source models might address these concerns. It helped establish the philosophical foundation that several other speakers built upon, particularly the idea that the entire approach to language in technology needs fundamental restructuring.
What is actually needed is this community that… community involvement in all stages of technology, because we need community contribution in order to get technology running. But at the same time, we need to make sure that technology that is produced, it is not harmful, it is ready, it corresponds for what the community needs… This is really a mind shift, because we run… with developers, even with academia, who should be kind of very well aware of issues.
Speaker
Valts Ernstreits
Reason
Ernstreits identifies a critical gap between academic awareness and practical implementation, suggesting that even well-intentioned researchers and developers fail to understand indigenous perspectives. His emphasis on community involvement ‘in all stages’ goes beyond consultation to suggest genuine partnership and co-creation. The observation about academia being unaware despite their supposed expertise is particularly striking.
Impact
This comment reinforced the emerging theme about the need for fundamental mindset changes in how technology is developed. It supported and expanded on earlier points about community control and helped establish consensus among panelists about the inadequacy of current approaches, even in supposedly progressive academic settings.
Overall assessment
These key comments collectively transformed what could have been a technical discussion about language technology into a profound examination of power, rights, and systemic barriers in the digital age. The most impactful insight was the recognition that the primary obstacles are not technological but structural – existing technology works, but delivery systems exclude indigenous languages through design choices that reflect broader power imbalances. The discussion evolved from identifying problems to articulating a vision of technological sovereignty where indigenous communities don’t just use technology but shape it. The comments created a progression from technical barriers (Moshagen) to ethical paradoxes (Bongo) to rights-based frameworks (Keskitalo, Laiti) to implementation challenges (Ernstreits), ultimately establishing that meaningful progress requires fundamental changes in how the tech industry conceptualizes language support – from market-driven localization to rights-based inclusion with genuine community partnership.
Follow-up questions
How can one add one’s own language to models from big technology companies like OpenAI, Apple, and Microsoft?
Speaker
Sjur Norstebo Moshagen
Explanation
This is identified as a major question in the discussion about making indigenous languages accessible through AI platforms from major tech companies
How to teach programming in Sámi languages and what are the cultural aspects of computing?
Speaker
Outi Kaarina Laiti
Explanation
These questions remain open after 10 years of teaching programming to children in basic education in Finland, indicating ongoing research needs
What should be represented when creating non-playable or playable Sámi characters in games, and how should they communicate?
Speaker
Outi Kaarina Laiti
Explanation
These are identified as ethical questions in game development that need addressing for proper indigenous representation
How to develop cognitive tests and IQ tests that are equitable and work well for minority languages and cultures?
Speaker
Lars Ailo Bongo
Explanation
This is needed for adaptive learning AI systems but requires collecting sensitive data from indigenous communities in an ethical manner
How to create regulatory sandboxes that ensure data collection from indigenous communities is done in an ethical and safe manner?
Speaker
Lars Ailo Bongo
Explanation
This is crucial for developing AI tools for indigenous communities while avoiding historical patterns of racist research
How to combine speech archives with their textual equivalents for better AI training?
Speaker
Outi Kaarina Laiti
Explanation
Large Sámi media archives exist but lack corresponding text versions needed for effective AI model training
How can communities be involved in all stages of technology development, not just data provision?
Speaker
Valts Ernstreits
Explanation
This addresses the need for indigenous communities to be co-creators rather than just data sources or end users
How to ensure AI-generated content quality supports rather than harms indigenous languages and their speakers?
Speaker
Sjur Norstebo Moshagen
Explanation
This is critical for preventing AI from perpetuating errors or inappropriate content in indigenous languages
How to shift from seeking permission to entering true partnerships with indigenous peoples in technology development?
Speaker
Aili Keskitalo
Explanation
This represents a fundamental change needed in how tech companies approach indigenous language communities
How can new protocols like SOLID help maintain data ownership while enabling AI development for indigenous languages?
Speaker
Henry Wang (Audience member)
Explanation
This explores alternative technical architectures that could solve the conflict between data ownership and AI model training needs
Disclaimer: This is not an official session record. DiploAI generates these resources from audiovisual recordings, and they are presented as-is, including potential errors. Due to logistical challenges, such as discrepancies in audio/video or transcripts, names may be misspelled. We strive for accuracy to the best of our ability.