Worldcoin: Eye-scanning ID is here

Worldcoin history 

Back in the golden era of blockchain (2018-2019), when everyday questions were routinely promised a blockchain solution, a group of people started working on an ambitious project called Worldcoin. The project set out to solve the challenge of unique online identification (our so-called digital identity). In particular, Worldcoin developed a system for recording and storing users’ digital biometric data and offering them a reward in the form of digital tokens. The data that Worldcoin gathered were iris scans. To join the user base, people would go to a designated location and consent to having their irises scanned. This was done using a shiny spherical device named the Orb. In the short period that the Orb was in use, a significant database of human irises was assembled. Least-developed countries had the most users, as was generally expected, because Worldcoin guaranteed tokens as incentives (i.e. money) ‘simply for being human’.

Worldcoin logo and an Orb: a silver sphere with a diagonal copper coloured stripe, and a copper coloured base.
Worldcoin’s Orb custom biometric device. Source: worldcoin.org

The technology behind the identification scheme is the following: iris scans were digitally obfuscated using a hashing function, a cryptographic technique that converts a set of digital data into a fixed-length value (a hash) from which the original data cannot feasibly be recovered. That unique hash was added to the database as each person’s unique identifier. Even though only the hash is stored, significant concerns were raised that a possible data breach could create a privacy and data nightmare. The crypto community’s serious concerns about a scary dystopian future undermined the project. Worldcoin was almost forgotten and came to be seen as one of the most ambitious yet obscure projects in the crypto community.
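To make the idea concrete, here is a minimal sketch of hash-based identification, assuming an iris scan has already been reduced to a stable byte string; the template values and optional salt are purely illustrative and not Worldcoin’s actual pipeline.

```python
import hashlib

def iris_code_to_identifier(iris_code: bytes, salt: bytes = b"") -> str:
    """Derive a fixed-length identifier from an iris template.

    A cryptographic hash is one-way: the identifier can be recomputed from
    the same iris code, but the code cannot be recovered from the identifier.
    The salt is illustrative only.
    """
    return hashlib.sha256(salt + iris_code).hexdigest()

# Two scans that yield the same template map to the same identifier, so a
# person trying to enrol twice can be detected without storing raw scans.
alice = iris_code_to_identifier(b"example-template-alice")
bob = iris_code_to_identifier(b"example-template-bob")
print(alice != bob)  # True: different irises, different identifiers
```

In practice, biometric templates are noisy, so real systems need an error-tolerant encoding step before any hashing; the sketch above glosses over that.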

The rebirth and rebranding of Worldcoin

Fast forward to 2022, when the Worldcoin project leader, Sam Altman, became globally famous as OpenAI’s CEO. Only half a year after the ambitious ChatGPT launch and the global excitement about predictive language models, Altman pushed the ‘old’ Worldcoin idea into the public space again.

Earlier this week, the ‘new’ Worldcoin project launched worldwide, but with one significant difference. It is being publicised as ‘a new identity and financial network owned by everyone’. The rebranding is important because the project team now claims that what they are building is essentially indistinguishable from the Public Key Infrastructure (PKI) deployed by big companies and the technical internet community. PKI is a set of standards, software, and hardware used in digital certificates and for managing public-key encryption. This is done via certificate authorities, with one of the most notable implementations being the HTTPS protocol used for secure web browsing. Worldcoin will use a cryptographic technique known as zero-knowledge proof (ZKP).

This obfuscating technique allows verification that a ‘given statement is true while avoiding conveying any information to the verifier beyond the mere fact of the statement’s truth’. The technique is used in some privacy-oriented cryptocurrencies, and it opens the door to user-defined online privacy: you decide what information you want to share with whom. For example, an online service does not need to know all your credentials and data; in practice, it only needs your IP address (for geolocation) and information like gender or age for advertising or other purposes. ZKP solutions were tested in COVID-19 tracking apps and are at the core of the EU’s new Digital Identity proposal. Significant concerns exist about the gatekeepers of the certificate authorities that store the data. This issue is crucial for sensitive data, such as the biometric data collected by the Orbs.
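As an illustration of the general idea (not Worldcoin’s actual protocol), here is a toy Schnorr-style proof of knowledge: the prover convinces the verifier that they know a secret exponent without revealing it. The parameters are deliberately tiny and unsafe; real deployments use groups hundreds of bits long.

```python
import hashlib
import secrets

# Toy parameters: safe prime p = 23 with a subgroup of prime order q = 11
# generated by g = 2. For illustration only.
p, q, g = 23, 11, 2

def challenge(*vals) -> int:
    """Fiat-Shamir challenge derived by hashing the public values."""
    data = "|".join(str(v) for v in vals).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def prove(x: int):
    """Prove knowledge of x such that y = g^x mod p, without revealing x."""
    y = pow(g, x, p)
    r = secrets.randbelow(q)      # one-time random nonce
    t = pow(g, r, p)              # commitment
    c = challenge(g, y, t)        # challenge
    s = (r + c * x) % q           # response
    return y, (t, s)

def verify(y: int, proof) -> bool:
    t, s = proof
    c = challenge(g, y, t)
    return pow(g, s, p) == (t * pow(y, c, p)) % p

y, proof = prove(x=7)     # 7 is the secret; only y and the proof are shared
print(verify(y, proof))   # True, yet the verifier learns nothing about x
```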

How is this data stored? Is any unencrypted version of the iris data stored in a secure manner (e.g. in the Orb’s temporary internal memory)? Who has access to this data? Or, even worse, can it end up on the black market or be misused somehow? In its launch report, Worldcoin claimed that ‘The Orb sets a high bar to defend against scalable attacks; however, no hardware system interacting with the physical world can achieve perfect security’.

One way of looking at Worldcoin is that it is very similar to Apple’s PKI, and there is nothing to be worried about. One difference is that part of the identifier data will be stored on Ethereum’s public, open-source blockchain, while World IDs are issued on the Worldcoin protocol. The Worldcoin protocol was developed by Tools For Humanity, a company established by the founders of the original Worldcoin project, Alex Blania and Sam Altman. The design ensures that no trusted third party can introduce data-handling risks or the accountability issues related to them. Users have control of the process. However, the past has shown us that human users are usually the weakest link. Human factors include the very real risk that users will share their biometric data like they share their ultrasounds. So far, technology has not found a way to limit voluntary violations of privacy and security. The UK data watchdog, the Information Commissioner’s Office, has already announced a probe into Worldcoin’s privacy and data protection practices.

An interactive infographic at https://worldcoin.org/home shows a global map marked with different colours of dots representing global users of World IDs, transactions, activities, and milestones.
The Worldcoin home page shows this interactive map of its global users.

Another part of the project makes it significantly different from known PKI schemes: the digital currency reward that users receive for sharing their biometric data.

Worldcoin was not accessible in the USA at its launch, and anyone wishing to participate had to confirm that they were outside the USA. The Worldcoin launch report clearly stated that tokens distributed in the system will only be available where laws allow this to happen.

Why is this important? 

Aside from the technological, privacy and data protection, and other ethical questions raised, the financial incentives and infrastructure underlying the project will also be scrutinised.

Only a couple of years ago, Meta (then Facebook) and Mark Zuckerberg announced the launch of the Libra digital token, which, in their words, could offer a solution for cross-border payments in different currencies across all Meta apps (Facebook, Instagram, and WhatsApp). Meta signed agreements with major payment institutions like Visa and Mastercard and giant online retailers like eBay, but US legislators torpedoed the project. In three separate hearings before the Senate and the House, US lawmakers made it clear that no digital coin issued by a private company can be considered an international means of payment, particularly if it is pegged to or in any way related to the US dollar, which is regarded as a global reserve currency. The Libra project was shut down after two years, and mentions of Libra were erased from company websites.

Digital currencies issued by private companies remain of primary interest to major state powers, like the USA and the UK, and to international financial organisations, like the Bank for International Settlements and the Financial Stability Board. This, in fact, might be a more significant obstacle for Worldcoin than data collection and privacy issues.

Worldcoin promotes the ’proof of personhood’ idea, which establishes an individual as both human and unique, and might become indispensable for distinguishing real people from AI identities, like bots, bot farms, and ‘fake humans’. We will certainly hear more about this project.

MOVEit hack: what is it and why is it important?

A string of disclosures

On 31 May, Progress Software Corporation disclosed that its managed file transfer (MFT) software, MOVEit Transfer, is susceptible to a critical SQL injection vulnerability, which allows unauthenticated attackers to gain access to MOVEit Transfer databases.
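The specifics of CVE-2023-34362 are more involved, but the general class of flaw can be sketched with a generic, hypothetical example: when user input is pasted directly into an SQL statement, crafted input can rewrite the query, whereas a parameterised query treats the input strictly as data. The table and values below are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, token TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'secret-token')")

def lookup_vulnerable(name: str):
    # UNSAFE: user input is concatenated straight into the SQL statement,
    # so crafted input can change the query's meaning.
    query = f"SELECT token FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()

def lookup_safe(name: str):
    # SAFE: a parameterised query treats the input strictly as data.
    return conn.execute("SELECT token FROM users WHERE name = ?", (name,)).fetchall()

# An attacker supplies input that makes the WHERE clause always true:
payload = "' OR '1'='1"
print(lookup_vulnerable(payload))  # returns every token in the table
print(lookup_safe(payload))        # returns nothing
```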

On 2 June, the vulnerability received the designation CVE-2023-34362. CVE stands for Common Vulnerabilities and Exposures; a CVE ID number is assigned to publicly disclosed vulnerabilities. Once a CVE is assigned, vendors, industry, and cybersecurity researchers can exchange information to develop remediation.

On 9 June, Progress announced an additional vulnerability (CVE-2023-35036), identified during code reviews, and released a patch for it. On 15 June, a third vulnerability was announced (CVE-2023-35708).

Threat actors exploiting these zero-day vulnerabilities have attacked more than 162 known victims, including the BBC, Ofcom, British Airways, Ernst and Young, Siemens Energy, Schneider Electric, UCLA, AbbVie, and several government agencies. Sources also report the compromise of the personal data of more than 15.5 million individuals.

Behind the attack

Microsoft attributed the MOVEit hack to Lace Tempest, a threat actor known for ransomware, data theft, and extortion attacks, and for running the extortion website of the CLOP ransomware group. On 6 June, the CLOP ransomware gang posted a notice on its leak site demanding that victims contact them before 14 June to negotiate extortion fees for deleting stolen data.

The identity and whereabouts of the CLOP gang remain unknown to the public. However, security researchers believe the group is either linked to Russia or comprises Russian-speaking individuals. 

Supply chain security flaws 

The MOVEit hack has again highlighted that supply chain security is a significant concern for industries and the public sector. Across supply chains, who is responsible for what? And how can we ensure cross-sectoral and cross-border cooperation between multiple actors to mitigate security risks?

While national cybersecurity agencies continue publishing guidance on mapping and securing supply chains, the industry implements good practices for reducing vulnerabilities and building secure ICT infrastructures. Still, organisations have different levels of maturity and resources to respond effectively. Luckily, ongoing discussions at different levels address these topics: from international processes, such as the Geneva Dialogue, which advance the implementation of the relevant UN GGE norms to reduce vulnerabilities and secure supply chains, to national and industry-specific discussions on developing and adopting new security measures (e.g. SBOMs).

Another challenge lies in conducting effective investigations, with the participation of several states and/or private partners, to identify a threat actor and stop the activity.

Digital policy trends in June 2023

Governing AI: What are the appropriate AI guardrails? 

AI governance remains the number one trend in digital policy as national, regional and global efforts to shape AI guardrails continue.

The EU’s risk-based approach

The European Parliament’s approval of the AI Act is a groundbreaking development. This regulation classifies AI systems based on risk levels and includes safeguards for civil rights, with severe fines for violations. Next in the legislative process are the so-called trilogues, where the European Parliament, the EU Council, and the Commission have to agree on a final version of the act; the expectation is that agreement will be reached by the end of the year.

A new study from Stanford suggests that leading AI models still fall far short of the responsible AI standards set by the AI Act (the version agreed in the EP), notably lacking transparency on risk mitigation measures. But some in the industry argue that the rules impose too heavy a regulatory burden. A recent open letter signed by some of the largest European companies (e.g. Airbus, Renault, Siemens) notes that the AI Act could harm the EU’s competitiveness and could compel them to move out of the EU to less restrictive jurisdictions. Companies are, in fact, doing their best to shape things: for example, OpenAI lobbied successfully in the EU for the forthcoming AI Act not to consider its general-purpose AI systems high risk, a classification that would trigger stringent legal requirements like transparency, traceability, and human oversight. OpenAI’s arguments align with those previously made in the lobbying efforts of Microsoft and Google, which argued that stringent regulation should be imposed only on companies that explicitly apply AI to high-risk use cases, not on companies that build general-purpose AI systems.

Given the EU’s track record on data protection rules, its proposed AI Act was anticipated to serve as an inspiration to other jurisdictions. In June, Chile’s Parliament initiated discussions on a proposed AI Bill, focusing on legal and ethical aspects of AI’s development, distribution, commercialisation, and use.

More regional rules are in the works: it has been revealed that ASEAN countries are planning an AI guide that will tackle governance and ethics. In particular, it will address the use of AI for generating misinformation online. The guide is expected to be adopted in 2024, and strong momentum is expected during Singapore’s chairmanship of ASEAN in 2024.

Business-friendlier approaches

Considering that Singapore itself is taking a collaborative approach to AI governance and is focused on working with businesses to promote responsible AI practices, the ASEAN guide is not likely to be particularly stringent (watch out, EU?). Softer, more collaborative approaches are also expected to be formulated in Japan and the UK, which believe such an approach will help them position themselves as AI leaders. 

Another country taking a more collaborative approach to AI governance is the USA. Last month, President Biden met with Big Tech critics from civil society to discuss AI’s potential risks and implications for democracy, including the dissemination of misinformation and the exacerbation of political polarisation. The US Commerce Department will create a public working group to address the potential benefits and risks of generative AI and develop guidelines to effectively manage those risks. The working group will be led by NIST and comprise representatives from various sectors, including industry, academia, and government.

Patchwork

As countries continue their AI race, we might end up with a patchwork of legislation, rules, and guidelines that espouse conflicting values and priorities. It is no surprise that calls for global rules and an international body are also gaining traction. A future global AI agency inspired by the International Atomic Energy Agency (IAEA), an idea first put forward by OpenAI CEO Sam Altman, has garnered support from UN Secretary-General Antonio Guterres.

France is advocating for global AI regulation, with President Macron proposing that the G7 and the Organisation for Economic Co-operation and Development (OECD) would be good platforms for this purpose. France wants to work alongside the EU’s AI Act while advocating for global regulations and also intends to collaborate with the USA in developing rules and guidelines for AI. Similarly, Microsoft’s President Brad Smith called for collaboration between the EU, the USA, and G7 nations, adding India and Indonesia to the list, to establish AI governance based on shared values and principles. 

In plain sight: SDGs as guardrails

However, the road to global regulations is typically long and politically tricky, and its success is not guaranteed. Diplo’s Executive Director Dr Jovan Kurbalija argues that humanity is missing valuable AI guardrails that are in plain sight: the SDGs. They are current, comprehensive, strong, stringently researched, and immediately applicable. They already have global legitimacy and are neither centralised nor imposing. These are just a handful of reasons why the SDGs can play a crucial role; there are 15 reasons why we should use the SDGs for governing AI.


Digital identification schemes gain traction 

Actors worldwide are pushing for more robust, secure and inclusive digital ID systems and underlying policies. 

A businessman uses fingerprint identification to access and protect personal data.

The OECD Council approved a new set of recommendations on the governance of digital identity centred on three pillars. The first addresses the need for systems to be user-centred and integrated with existing non-digital systems. The second focuses on strengthening the governance structure of the existing digital systems to address security and privacy concerns, while the third pillar addresses the cross-border use of digital identity.

Most recently, the EU Parliament and the Council reached a preliminary agreement on the main aspects of the digital identity framework put forward by the Commission in 2021. Previously, several EU financial institutions cautioned that specific sections of the regulation are open to interpretation and could require significant investments by the financial sector, merchants, and global acceptance networks. 

At the national level, a number of countries have adopted regulatory and policy frameworks for digital identification. Australia released the National Strategy for Identity Resilience to promote trust in the identity system across the country, while Bhutan endorsed the proposed National Digital Identity Bill, except for two clauses that await deliberation in the joint sitting of the Parliament. The Sri Lanka Unique Digital Identity Project (SL-UDI) is underway, and the Thai government introduced the ThaID mobile app to simplify access to services requiring identity confirmation.


Content moderation: gearing up for the DSA

Preparations for the DSA are in full swing, even though the European Commission has already faced its first legal challenge over the DSA, and it did not come from Big Tech as many would have expected. German e-commerce company Zalando filed a lawsuit against the Commission, contesting its categorisation as a systemic, very large online platform and criticising the lack of transparency and consistency in platform designation under the DSA. Zalando argues that it does not meet the requirements for such classification and does not present the same systemic risks as Big Tech.

Meanwhile, European Commissioner for Internal Market Thierry Breton visited Big Tech executives in Silicon Valley to remind them of their obligations under the DSA. Although Twitter owner Musk previously said that Twitter would comply with the DSA content moderation rules, Breton visited the company’s headquarters to perform a stress test evaluating Twitter’s handling of potentially problematic tweets as defined by EU regulators. Breton also visited the CEOs of Meta, OpenAI, and Nvidia. Meta agreed to a stress test in July to assess its readiness for the EU’s online content regulations, a decision prompted by Breton’s call for immediate action by Meta regarding content targeting children.

European Commissioner for Internal Market Thierry Breton. Credit: European Commission

The potential of the EU to exert its political and legal power over Big Tech will be demonstrated in the coming months, with the DSA becoming fully applicable in early 2024.

ChatGPT and GDPR: Balancing AI innovation with data protection

By Feodora Hamza

OpenAI’s ChatGPT has gained widespread attention for its ability to generate human-like text in response to prompts. However, after months of celebration for OpenAI and ChatGPT, the company is now facing legal action from several European data protection authorities who believe that it has scraped people’s personal data without their consent. The Italian Data Protection Authority temporarily blocked the use of ChatGPT as a precautionary measure, while French, German, Irish, and Canadian data regulators are also investigating how OpenAI collects and uses data. In addition, the European Data Protection Board set up an EU-wide task force to coordinate investigations and enforcement concerning ChatGPT, leading to a heated discussion on the use of AI language models and raising important ethical and regulatory issues, particularly those involving data protection and privacy.

Concerns around GDPR compliance: How can generative AI comply with data protection rules such as GDPR? 

According to Italian authorities, OpenAI’s disclosure regarding its collection of user data during the post-training phase of its system, specifically chat logs of interactions with ChatGPT, is not entirely transparent. This raises concerns about compliance with General Data Protection Regulation (GDPR) provisions that aim to safeguard the privacy and personal data of EU citizens, such as the principles of transparency, purpose limitation, data minimisation, and data subject rights.

As a condition for lifting the ban it imposed on ChatGPT, Italy has outlined the steps OpenAI must take. These steps include obtaining user consent for data scraping or demonstrating a legitimate interest in collecting the data, which is established when a company processes personal data within a client relationship, for direct marketing purposes, to prevent fraudulent activities, or to safeguard the network and information security of its IT systems. In addition, the company must provide users with an explanation of how ChatGPT utilises their data and offer them the option to have their data erased, or refuse permission for the program to use it.

Padlock symbol for computer data protection system. Source: Envato Elements

Steps towards GDPR compliance: OpenAI’s updated privacy policy and opt-out feature

OpenAI has updated its privacy policy, describing its practices for gathering, utilising, and safeguarding personal data. In a GPT-4 technical paper, the company stated that publicly available personal information may be included in the training data and that OpenAI endeavours to protect people’s privacy by using models to eliminate personal data from training data ’where feasible’. In addition, OpenAI now offers an incognito mode on ChatGPT to enhance its GDPR compliance efforts, safeguard users’ privacy, and prevent the storage of personal information, granting users greater control over the use of their data.

The company’s choice to offer an opt-out feature comes amid mounting pressure from European data protection regulators concerning the firm’s data collection and usage practices. Italy demanded OpenAI’s compliance with the GDPR by 30 April. In response, OpenAI implemented a user opt-out form and the ability to object to personal data being used in ChatGPT, allowing Italy to restore access to the platform in the country. This move is a positive step towards empowering individuals to manage their data.

Challenges in deleting inaccurate or unwanted information from AI systems remain

However, the issue of deleting inaccurate or unwanted information from AI systems in compliance with GDPR is more challenging. Although some companies have been instructed to delete algorithms developed from unauthorised data, eliminating all personal data used to train models remains challenging. The problem arises because machine learning models often have complex black box architectures that make it difficult to understand how a given data point or set of data points is being used. As a result, models often have to be retrained with a smaller dataset in order to exclude specific data, which is time-consuming and costly for companies.
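A minimal sketch of that blunt remedy, with invented records and a generic scikit-learn classifier standing in for a real model: the data subject’s records are dropped and the model is rebuilt from scratch on the reduced dataset, which is exactly the time-consuming, costly step described above.

```python
from sklearn.linear_model import LogisticRegression

# Hypothetical training records: (features, label, data_subject_id)
records = [
    ([0.1, 1.2], 0, "user-17"),   # subject requesting erasure
    ([0.9, 0.3], 1, "user-42"),
    ([0.4, 0.8], 0, "user-99"),
    ([0.7, 0.1], 1, "user-05"),
]

def train_without(records, erased_subject):
    """Retrain from scratch after removing one data subject's records.

    There is no reliable way to surgically remove a person's influence from
    an already trained model, so the whole model is rebuilt on what remains.
    """
    kept = [(x, y) for x, y, subject in records if subject != erased_subject]
    X = [x for x, _ in kept]
    y = [y for _, y in kept]
    return LogisticRegression().fit(X, y)

model = train_without(records, erased_subject="user-17")  # erasure request honoured
```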

Data protection experts argue that OpenAI could have saved itself a lot of trouble by building in robust data record-keeping from the start. Instead, it is common in the AI industry to build datasets for AI models by scraping the web indiscriminately and then outsourcing the work of removing duplicates or irrelevant data points, filtering out unwanted material, and fixing typos. In AI development, the dominant paradigm is that the more training data, the better: OpenAI’s GPT-3 model was trained on a massive 570 GB of data. These methods, and the sheer size of the datasets, mean that tech companies tend not to have a full understanding of what has gone into training their models.

While many criticise the GDPR for being unexciting and for hampering innovation, experts argue that the legislation serves as a model for companies to improve their practices when they are compelled to comply with it. It is presently the sole means available to individuals to exercise any authority over their digital lives and data in a world that is becoming progressively automated.

The impact on the future of generative AI: The need for ongoing dialogue and collaboration between AI developers, users, and regulators

This highlights the need for ongoing dialogue and collaboration between AI developers, users, and regulators to ensure that the technology is used in a responsible and ethical manner. It seems that ChatGPT is facing a rough ride with Europe’s privacy watchdogs. The Italian ban seems to have been the beginning, since OpenAI has not set up a local headquarters in one of the EU countries yet, exposing it to further investigations and bans from any member country’s data protection authority.

However, while EU regulators are still wrapping their heads around the regulatory implications of and for generative AI, companies like OpenAI continue to benefit from and monetise the lack of regulation in this area. With the EU’s Artificial Intelligence Act expected to pass soon, the EU aims to address the gaps in the GDPR when it comes to regulating AI and to inspire similar initiatives proposed in other countries. The impact of generative AI models on privacy will probably be on regulators’ agendas for many years to come.

How search engines make money and why being the default search engine matters

By Kaarika Das and Arvin Kamberi

Samsung, the maker of millions of smartphones with Google Search preinstalled, is reportedly in talks to replace Google with Bing as the default search provider on its devices. This is the first serious threat to confront Google’s long-standing dominance over the search business. Despite Alphabet’s diversified segments, its core business and the majority of its profit come from Google Search, which accounted for US$162 billion of Alphabet’s US$279.8 billion total revenue last year. Naturally, Google’s top priority is to protect its core business and retain its position as the default search engine on electronic devices like tablets, mobiles, and laptops.

A critical question arises about the underlying business model of online search engines like Google, Bing, Baidu, Yandex, and Yahoo. What do these search engines stand to gain from being the default search engine on devices? Let us examine how search engines generate revenue while allowing users to explore the internet for information and content for free.

The profit model of search engines

Search engines make money primarily through advertising; Google, for instance, earns billions of dollars yearly from its Google Ads platform. The working mechanism is as follows: whenever users enter a search query, the search engine provides a list of web pages and other content related to the query, including advertisements. Advertisers pay search engines to display sponsored results when users search for specific keywords. These ads typically appear at the top and/or bottom of search engine results pages (SERPs) and are labelled as ‘sponsored’ or ‘ad’. Search engines get paid based on the number of clicks these ads receive. This model is popularly known as PPC (pay-per-click).
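A rough, back-of-the-envelope illustration of the PPC arithmetic, with invented numbers rather than any search engine’s actual figures:

```python
# Illustrative pay-per-click (PPC) arithmetic with invented numbers.
daily_searches = 100_000_000     # searches served per day
ad_click_through_rate = 0.03     # share of searches where a sponsored result is clicked
average_cost_per_click = 1.50    # what advertisers pay per click, in USD

daily_ad_revenue = daily_searches * ad_click_through_rate * average_cost_per_click
print(f"${daily_ad_revenue:,.0f} per day")  # $4,500,000 per day in this toy scenario
```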

Apart from sponsored listings, search engines also track user data for targeted advertising, using people’s search history. Search engines can easily gather information about users’ search history, preferences, and behaviours. This is done through cookies, IP address tracking, device and browser fingerprinting, and other technologies. Search engines then use these data points to profile their users and improve the targeting of advertisements. For example, if a user frequently searches for recipes and food, the search engine may display advertisements for restaurants and related food products. User search history thus helps improve search engine algorithms and enhances search accuracy by identifying patterns in user behaviour. In capitalising on user data, search engines allow advertisers to manage their advertisements using strategies such as ad scheduling, geotargeting, and device targeting, all made possible by accumulated user history data.
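A deliberately simplified sketch of how tracked search history could be turned into an interest profile for ad targeting; the categories, keywords, and queries are invented for illustration, not any search engine’s real taxonomy.

```python
from collections import Counter

# Invented interest categories keyed by keywords that might appear in queries.
INTEREST_KEYWORDS = {
    "food": {"recipe", "restaurant", "pasta", "bake"},
    "travel": {"flight", "hotel", "itinerary"},
    "fitness": {"gym", "running", "protein"},
}

def profile_user(search_history: list[str]) -> Counter:
    """Count how often each interest category appears in a user's queries."""
    profile = Counter()
    for query in search_history:
        words = set(query.lower().split())
        for category, keywords in INTEREST_KEYWORDS.items():
            if words & keywords:
                profile[category] += 1
    return profile

history = ["best pasta recipe", "cheap flight to Rome", "easy bake cookies"]
print(profile_user(history).most_common(1))  # [('food', 2)] -> show restaurant ads
```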

Google making money from its search engine. Image generated by DALL-E/OpenAI.

The power of default

Let us now delve into the edge a search engine gains from being the default. Regardless of the default, people can always change the search engine on their devices based on personal preference. Despite the absence of any exclusivity, there is massive inertia against changing the default search engine. The effort required to manually navigate to a different search engine makes the transition a hassle, especially for ordinary people. In parallel, less tech-savvy people may not be aware of alternative search engines and might have no explicit preference for a specific one. Even when users are aware of alternatives, the effectiveness, performance, and security of an alternative search engine on their current device remain unverified, which may lead to apprehension.

Therefore, a default search engine further provides a sense of security (however misleading), as its performance and device compatibility are assumed to have been vetted by the manufacturer. Being the default is thus advantageous for a search engine, as it provides a broader audience base, leading to increased traffic alongside greater brand recognition. Large traffic, in turn, ensures that search engines remain attractive to advertisers, their primary source of revenue: the higher the number of search engine users, the dearer the advertising space becomes, generating better returns.

For users, however, pre-installed search engines deprive them of the choice to select a preferred alternative, including search engines that do not track user details. In 2019, the European Commission stated that Google had an unfair advantage by pre-installing its Chrome browser and Google search app on Android smartphones and notebooks. To address these antitrust concerns, in early 2020, Google enabled Android smartphones and tablets sold in the European Economic Area (EEA) to show a ‘choice screen’ offering users four search engines to choose from.

While Google pays billions to device manufacturers like Samsung and Apple to remain the default search engine, the ongoing AI leap in the industry has enormous ramifications for the future of internet search and its underlying business model. With unprecedented developments in AI and search engine functionality increasingly integrated with AI, the tussle between search rivals battling for popularity and influence is set to continue.