A tutorial on public policy essentials of data governance

29 Nov 2019 09:30h - 11:00h

Event report

[Read more session reports and updates from the 14th Internet Governance Forum]

The session shed light on important concepts in data governance, such as profiling, and revealed how some complex processes, such as the market for technology-based advertising, work without public awareness or scrutiny. Suggestions for technical and policy solutions in the field of data governance were put forward.

Our data is contextual and connected to us, affirmed Mr Jean F. Queralt (Civil Society, Asia-Pacific Group). This is important in the context of profiling. Profiling is the analysis, for specific purposes, of our interaction with Internet services. By means of profiling, companies try to understand how to manipulate individuals in order to extract value from them. In this way, an asymmetry of power exists between individuals and companies, because the latter own the infrastructure and the platforms we use, and so, they can enforce their policies against our will. If we buy an album online and try to share it in a way that disrespects copyright, for example, that album may vanish from our catalogue. This means that we do not truly own the things we buy anymore. Moreover, while data protection norms require platforms to delete personal data if an individual cancels an account, we have no guarantee that this data has actually been deleted. Data protection laws put the burden of verification on the shoulders of the users.

Personalised advertising is the central business model of the Internet, turning the Internet into an infrastructure of surveillance. Online and offline information is collected to build individual profiles. Every time a user opens a website with ads, the user's profile is sent to an auction network where advertisers bid for their attention. The winners of those bids get to show us their ads. According to Mr Duncan McCann (Civil Society, Western European and Others Group (WEOG)), companies hold 60 million profiles in the UK, and auction bid requests happen at a rate of 10 billion a day. The biggest global profilers are Oracle and Acxiom. Oracle has about 3 billion profiles and claims to hold about 30 000 attributes on each.
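The auction process described above can be sketched in a few lines. This is a deliberately simplified illustration (all names and figures are hypothetical, not taken from any real ad exchange): a bid request carrying profile attributes is broadcast, each advertiser prices the impression, and the highest bidder wins, here paying the second-highest price, a common auction design.

```python
# Hypothetical sketch of a real-time bidding auction for one ad impression.

def run_auction(bid_request, advertisers):
    """Collect bids for one impression and pick a winner (second-price rule)."""
    bids = [(adv["name"], adv["bid"](bid_request)) for adv in advertisers]
    bids.sort(key=lambda b: b[1], reverse=True)
    winner, top_bid = bids[0]
    price = bids[1][1] if len(bids) > 1 else top_bid  # winner pays second price
    return winner, price

# The bid request carries attributes that profilers have inferred about the user.
bid_request = {"age_band": "25-34", "interests": ["travel", "fitness"]}

# Each advertiser values the impression differently, based on the profile.
advertisers = [
    {"name": "AirlineCo", "bid": lambda r: 0.40 if "travel" in r["interests"] else 0.05},
    {"name": "GymCo",     "bid": lambda r: 0.25 if "fitness" in r["interests"] else 0.05},
]

winner, price = run_auction(bid_request, advertisers)
print(winner, price)  # AirlineCo wins and pays GymCo's bid of 0.25
```

In a real exchange this happens in milliseconds, billions of times a day, with the full profile travelling in each bid request, which is the data flow the speakers criticised.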

A huge flow of personal data – including sensitive data such as mental health information, political affiliation, sexual preferences, and gender identification – goes from the profilers into the advertising industry. The market is highly concentrated: almost 90% of digital ad revenue goes to two companies, Google and Facebook. Profiling matters not only for advertising, but also for the development of algorithms that play an increasingly large role in influencing decisions in important areas such as job placement or crime prevention, as McCann mentioned. Algorithms can cause harm if they are badly designed, that is, if their use of information produces biased results or if they rely on incorrect information. Acxiom, for example, one of the leaders in the profiling business, publicly acknowledges that about 30% of the information it holds on individuals is incorrect.

Another problem with the ad tech industry is fraud. Large numbers of bots are designed to click on ads; as a result, 56% of paid advertisements are never seen by a human. The marketing industry is facing considerable backlash from the European Commission for inflating the real benefit of marketing, which leads to a significant waste of financial resources. Because of these findings and of the regulatory pressure created by laws such as the General Data Protection Regulation (GDPR), some companies are moving away from ‘ad tech’ into contextual advertising, which may give them better results.

The relation between the concentration of data and the concentration of wealth and power was explained by Ms Deepti Bharthur (IT for Change). Mergers and acquisitions have led to a consolidation of data in the hands of a few players. Data holders exercise asymmetrical power over other economic actors and over societies. This data does not belong to private companies, but has been privatised by them owing to a lack of regulation. A wave of hyper-optimism based on technological ‘solutionism’ has paralysed policy makers. Regional trade agreements pushing for cross-border data flows risk further shrinking policy options for developing nations.

Some concrete suggestions were presented during the session. According to McCann, governments should fund an independent and decentralised digital identity system that would allow individuals to prove their identity online without giving away their personal data. It should be a co-operative identity system in which individuals have direct control over their data, deciding which attributes they want to include and which inferences can be made from their profiles. This independent organisation would stipulate how companies and governments could access the identity system when they need to verify our identity.
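McCann did not specify a technical mechanism, but one common building block for proving something about yourself without handing over your whole profile is selective disclosure. The sketch below is a minimal illustration under that assumption (it is not a real identity standard): the system stores only salted hashes of a user's attributes, and the user reveals a single attribute, plus its salt, only when a verifier needs it.

```python
# Minimal, hypothetical sketch of selective attribute disclosure.
import hashlib
import secrets

def commit(value, salt):
    """Salted hash commitment to one attribute value."""
    return hashlib.sha256((salt + value).encode()).hexdigest()

# Issuance: the identity system publishes commitments; the user keeps the salts.
attributes = {"age_over_18": "true", "country": "BR"}
salts = {k: secrets.token_hex(8) for k in attributes}
commitments = {k: commit(v, salts[k]) for k, v in attributes.items()}

# Disclosure: the user reveals only one attribute, not the whole profile.
disclosed = ("age_over_18", attributes["age_over_18"], salts["age_over_18"])

# Verification: a relying party checks the revealed value against the commitment.
name, value, salt = disclosed
assert commit(value, salt) == commitments[name]
print("attribute verified without exposing the rest of the profile")
```

The point of the design is that control sits with the individual: nothing about `country` leaks when only `age_over_18` is needed, which mirrors the direct control over attributes and inferences that McCann called for.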

Queralt suggested that programmers should receive training on human rights during their education. They are not part of the discussion on human rights and, consequently, they are not aware of the harm that their products may cause. Programmers are the real next generation of human rights defenders, because they are the ones with the power to make real changes.

A researcher and consultant from Brazil working on data governance mentioned that websites use metadata to structure data, in order to standardise it and make it accessible to different search engines, for example. He suggested that personal data – which is currently collected without any standardisation – should follow a similar approach. This would ensure that data portability is fully implemented and, ultimately, give individuals true ownership of their data. Companies have no incentive to structure personal data in this harmonised way; governmental regulation is therefore needed.
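The standardisation idea can be illustrated with a toy example (the schema and field names below are hypothetical, not an existing standard): if every platform mapped its internal records onto one shared schema, a personal data record exported from one service could be imported by any other, which is what makes portability practical.

```python
# Hypothetical sketch: exporting personal data against a shared schema.
import json

# A shared, agreed-upon schema: field names and expected types.
SHARED_SCHEMA = {"name": str, "email": str, "interests": list}

def export_record(raw, field_map):
    """Map a platform's internal field names onto the shared schema."""
    record = {std: raw[internal] for std, internal in field_map.items()}
    for field, expected in SHARED_SCHEMA.items():
        assert isinstance(record[field], expected), f"bad type for {field}"
    return json.dumps(record)

# One platform's internal representation uses its own field names...
platform_a = {"full_name": "Ana", "mail": "ana@example.org", "tags": ["cycling"]}
portable = export_record(
    platform_a,
    {"name": "full_name", "email": "mail", "interests": "tags"},
)

# ...but the exported record is standardised, so any compliant service can read it.
imported = json.loads(portable)
print(imported["name"])  # Ana
```

Without a mandated schema, each platform's export is shaped by its internal model, which is why the speaker argued that regulation, rather than market incentives, would have to drive this harmonisation.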

Data governance needs to pursue the public interest. One way to achieve that, according to Bharthur, is community data: aggregated, de-identified personal data generated by a geographical or interest-based community. For example, small farmers in India have particular farming practices that they have developed over millennia. This is data generated by a community, and it should belong to them, not be privatised by a company like Monsanto. Community data should be governed as a collective resource or as a data commons.

By Marilia Maciel