Data governance

AI and data governance

How does AI improve data management?

AI brings numerous advantages to data management, including improved data quality through automated processes that clean, standardise, and validate data. It can enhance data privacy and security by using techniques such as natural language processing (NLP) and machine learning (ML) to identify sensitive information and help ensure compliance with data protection regulations. AI also assists in data classification and categorisation, making it easier to organise and retrieve data for better governance. Furthermore, AI can automate data management tasks such as data lineage tracking, data cataloguing, and data stewardship. This streamlines workflows, increases efficiency, and promotes consistent data management practices across systems. AI also aids risk assessment in data management by analysing large datasets to detect patterns, anomalies, and potential risks. This allows organisations to proactively identify and mitigate risks related to data quality, unauthorised access, or breaches.
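To make the idea of automatically identifying sensitive information more concrete, the sketch below flags records that appear to contain contact details. It is only an illustration under stated assumptions: it uses simple regular expressions as a stand-in for the NLP and ML techniques mentioned above, and the pattern set and the function names `classify_record` and `label` are hypothetical rather than taken from any particular tool.

```python
import re

# Hypothetical, deliberately narrow patterns for illustration only.
# Production tools combine many more rules with trained models.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def classify_record(text: str) -> dict:
    """Return the sensitive-data categories found in a free-text record."""
    found = {name: pattern.findall(text) for name, pattern in PATTERNS.items()}
    # Keep only the categories that actually matched something.
    return {name: matches for name, matches in found.items() if matches}


def label(text: str) -> str:
    """Assign a coarse governance label based on what was detected."""
    return "sensitive" if classify_record(text) else "general"


if __name__ == "__main__":
    for record in ("Contact jane@example.org", "Quarterly rainfall totals"):
        print(record, "->", label(record))
```

A rule-based check like this is easy to audit but misses anything outside its patterns, which is one reason organisations turn to learned classifiers for this task.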

What are the challenges that AI brings to data governance?

These improvements in data management are not without dangers for data governance. While AI algorithms can identify and mitigate biases in data, they can also inadvertently introduce or amplify them. Data security and privacy risks emerge because AI relies on large volumes of sensitive information, leaving organisations vulnerable to malicious attacks or unauthorised access. Regulatory compliance becomes more complex as AI processes huge amounts of data, requiring organisations to navigate and meet legal obligations related to data protection and privacy. Ethical implications and responsibilities in data-driven organisations are also increasingly important due to regulations like the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA). These regulations set minimum standards for data protection, and non-compliance can lead to severe penalties. However, removing incorrect or undesirable information from AI systems to comply with the GDPR or similar regulatory frameworks is challenging: the complexity of machine learning models makes it difficult to identify and remove specific kinds of data distortions and errors. The rapid advancement of AI technology also makes it hard to keep data governance policies and regulations current. As AI evolves, new capabilities and risks emerge, requiring continuous monitoring and adaptation of data governance frameworks.

Data governance refers to the governance of data between states and the management of international data flows. It covers the whole life cycle of data: collection, processing, storage, use, security, and management.

Data governance: Moving away from a one-size-fits-all approach

As discussions on data governance mature, 2023 will see a departure from the one-size-fits-all approach towards conversations on how to regulate different types of data, such as personal, corporate, public, and health data. In parallel, this will require a holistic approach that takes into account standardisation, security, human rights, and legal perspectives. For governments worldwide, 2023 could be a landmark year in their search for ways to reconcile two aspects:

The need to ascertain sovereignty over critical and sensitive data that needs to be stored physically on national territories (registries, health data, etc.)

The fact that free flow of data across national and corporate borders facilitates economic development and contributes to the public good (e.g. environmental data)

Win-win solutions are of course ideal, but realistically, governments will have to make optimal trade-offs between the two. Data governance has grown from a mainly privacy-related issue to a multifaceted one, with implications reaching the economy, law enforcement, cybersecurity, and even geopolitics. Data-driven business models are growing fast and are becoming critical in all sectors of the economy, from manufacturing to services. The processing of personal data (information relating to an identified or identifiable person) also enables scientific advancements in fields that range from healthcare to autonomous driving. Large datasets help lawmakers enact effective public policies and power digital government. In such a context, international flows of data have become increasingly important for states and companies, and are playing a key role in various treaties, conventions, and trade agreements around the world. The regulation of these flows has therefore become an important matter for stakeholders on a global scale. There are several reasons why a country might want to regulate its data flows: i) to safeguard the privacy of its citizens, as is the case for most data protection legislation; ii) to meet other regulatory objectives, such as access to information for auditing purposes; iii) to protect national security and cybersecurity; and iv) to develop domestic capacity in data-intensive sectors, as a form of digital industrial policy. Data governance has four main facets: technology, economy, security, and law and human rights.

Technology

The technological aspect of data governance refers to the development of standards, apps, and services for data management. Not to be confused with the ‘data governance’ term used for an area of corporate and technical governance, the technological aspect of data governance is concerned with ensuring interoperability between systems and actors, widespread adoption of standards, and proper development of technologies that deal with data and related activities.

Standardisation bodies, telecommunication companies, and governments engage in multiple fora to discuss how to better face challenges like the fragmentation of data space, lack of interoperability, portability, and low quality of data, among others.

National security

The national security side of data governance is concerned with how the data of a country's citizens might be used both to protect national security and to threaten it.

In the first case, governments can use large sets of data as a means of fighting crime, terrorism, and money laundering. The use of data to fight pandemics has recently surfaced as a national security concern as well.

In the second case, openly available data (such as data from social networks) might be used by foreign entities in ways that pose a threat and/or undermine the national sovereignty and interests of a country. The use of personal data to influence elections, for instance, represents a significant threat many governments have expressed worries about.

Law and human rights

Increasingly often, information considered vital for criminal or civil investigations is held outside the jurisdiction of the investigative authorities. This strains existing international co-operation mechanisms (commonly known as Mutual Legal Assistance Treaties or MLATs), which are not considered efficient and timely enough for the current pace and volume of transnational investigations. This slow pace and uncertainty are undesirable and often encourage policymakers to adopt data localisation measures as the only means of solving the issue. However, other efforts are being made to create new solutions to this problem. Some national instruments like the US CLOUD Act have been established to facilitate international co-operation. Such regulations raise concerns amongst civil rights groups, who argue that these mechanisms might undermine privacy and rights against unreasonable searches of private data and information, since the government could enter into data sharing agreements with foreign countries and affected users would not be notified of the issuing of these warrants.

Economy

Data is the core economic resource of the new economy. Governments have been increasingly aware of the impact of data on national economies and have sought to implement policies to foster data-based industries within their borders. Key strategic technologies, like artificial intelligence (AI) and autonomous driving, are largely dependent on large inputs of data to be further developed and successfully implemented.

Some governments have stimulated specific high-tech sectors of their economies, attempting to create local ‘Silicon Valleys’ that can compete globally on the open market.

Other states have attempted more direct and protectionist approaches. Some have embedded data localisation measures in legislation so that data is stored within national borders. This is the case of Russia’s data localisation law, under which foreign companies have been fined hundreds of thousands of dollars for failing to store their data within Russian borders. Some states have implemented other related policies that force companies to establish facilities and subsidiaries in the country. This is often the case in the EU, where the General Data Protection Regulation imposes such restrictions on the destination of transborder flows that companies have no choice but to establish a physical and legal presence in the EU. Others, like China, have actively banned foreign tech firms from operating within the country, on claims of protecting ‘Internet sovereignty’, and have prohibited the outward flow of data as a means of protecting and ensuring a monopoly over their large volume of this key resource. The impact of choosing whether or not to restrict the free flow of data is still uncertain. Some scholars argue that data localisation and data protectionism are bad for a country’s economy as they increase the costs associated with this important resource, while others argue that, given long-term strategic goals, restricting the free flow of data might be worth the short-term economic disadvantages.

Such situations raise the question of whether legislation, policies, and institutional incentives for the localisation of data, restriction of outward flows, or the deployment of national data-intensive industries might be a new form of economic protectionism, while the calls for the free flow of data between jurisdictions might be a new form of trade liberalism, each motivated by state and corporate economic and/or political agendas. Digital protectionism, as it has been labelled, has become a relevant concern for governments, international organisations, and companies alike. Whether by restricting outward flows or by promoting the free flow of data, actors such as the US and China have been using data governance as an important instrument to achieve geopolitical and geoeconomic goals.