The session was opened by Mr Emmanuel Letouzé (Director, Data-Pop Alliance), who introduced Open Algorithms (OPAL), a project at the intersection of big data and public good. OPAL is in its pilot phase, with two telcos, Orange in Senegal and Telefónica in Colombia, having opened their databases to third parties. The project plans to go into its beta phase in the next two years, and to expand from 2020 onwards. It features an algorithm database, an application programming interface (API), and privacy by design, to comply with the General Data Protection Regulation (GDPR) and other laws.
The session comprised a panel discussion on the rationale for OPAL and its use cases, followed by a demonstration of the system.
Ms Natalie Grover (Global Program Manager, Data-Pop Alliance and OPAL Project) moderated the panel.
Ms Claire Melamed (Executive Director, Global Partnership for Sustainable Development Data) shared her insights on how OPAL was relevant to the sustainable development goals (SDGs). She explained that reconciling commercial interests with public interests was a key policy question in big data. OPAL was therefore an experiment whose progress should be followed, because it revolutionised the data economy by bringing together public and private actors.
Mr Pedro de Alarcon (Head of Big Data for Social Good, Telefónica (Colombia)) described how Telefónica had deployed OPAL. The company was guided by the principle of giving back to society and was currently allowing public agencies to query its data.
Mr Babacar Ndir (Director-General, Agence Nationale de la Statistique et de la Démographie (ANSD)) traced the evolution of Senegal’s national statistics systems. While analogue methods had been used to collect data in the past, digital methods were now being employed. The data could be matched with data held by private sector companies, such as mobile network operators, to provide more comprehensive insights. He noted that the agency's biggest challenge was improving its capacity for digital data processing.
Ms Sandra Moreno (Technical Director of Geostatistical Division, National Administrative Department of Statistics of Colombia (DANE)) described Colombia’s challenges in interpreting data. These included getting access to data from private companies, and the lack of capacity to process the data. For example, they could only analyse data for one city. She described OPAL as an opportunity to use algorithms to focus on selected problems. While the use of OPAL for statistical data was still experimental, she supported open data and public-private partnerships in big data.
Mr Seynade Ousmane (Senior Economist, IPAR) also spoke of OPAL in relation to the SDGs. In giving access to disaggregated data, the private sector had an opportunity to give back to society. He also gave an overview of OPAL’s governance model, known as the Council for the Orientation of Development and Ethics (CODE), whose main objective was to build ethics into OPAL use cases. There was also a capacity building aspect for users of the system.
De Alarcon gave a demonstration of the OPAL server in Colombia, which was connected to the telco server. The portal allowed users to ask a question and then choose an algorithm. Two algorithms were currently available: home detection and presence. Users could adjust the granularity level, per geographical area and time. The first query returned a graphic census. In the second viewing, the query returned a mobility matrix. These answers could be downloaded in different formats such as PDF, JPEG and GeoJSON.
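The report does not describe how the home detection algorithm works internally. As a rough illustration only, the sketch below assumes a common heuristic: a subscriber's home area is taken to be the area where they are most often observed at night. The record layout, the function names `detect_home` and `home_counts`, and the night-time window are all hypothetical and are not taken from OPAL.

```python
from collections import Counter, defaultdict


def detect_home(records, night_start=20, night_end=6):
    """Map each user to the region where they appear most often at night.

    `records` is an iterable of (user_id, hour_of_day, region) tuples.
    The night window (20:00 to 06:00) is an assumption for illustration.
    """
    night_counts = defaultdict(Counter)
    for user, hour, region in records:
        # Keep only observations that fall inside the night window.
        if hour >= night_start or hour < night_end:
            night_counts[user][region] += 1
    # Pick the most frequently observed night-time region per user.
    return {user: counts.most_common(1)[0][0]
            for user, counts in night_counts.items()}


def home_counts(records):
    """Aggregate detected homes into a per-region head count."""
    return Counter(detect_home(records).values())


# Hypothetical sample data, not real telco records.
sample = [
    ("a", 22, "Bogota"), ("a", 23, "Bogota"), ("a", 12, "Medellin"),
    ("b", 2, "Medellin"), ("b", 14, "Bogota"),
    ("c", 21, "Bogota"),
]
print(home_counts(sample))
```

Only the aggregated per-region count would be returned to a user in such a design; the per-user mapping stays inside the function that runs next to the data.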
Following feedback from the community, they had developed an application programming interface (API) for researchers to design their own data queries. The data was available free of charge to governments, and they were exploring pricing so that they could sell data to private companies. Participants also got to view the API and were shown how to query the system using a token. The platform was open source, so users could improve it.
Mr Nicolas de Cordes (Vice President, Marketing Anticipation, Orange Group) presented OPAL’s architecture, demonstrating its privacy by design features. These included pseudonymisation, and the aggregation and anonymisation of data before it was accessed through the API. The system was also designed to run on a variety of databases, from telcos to banks and insurers. Use case examples included a study of schools in Senegal where issues such as poverty and peace were correlated with literacy rates. The model revealed that the rate of teacher training, the number of schools, and the concentration of roads correlated with literacy rates. Outliers, such as poor communities with high literacy rates, were studied further to draw lessons for improving education.
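The privacy by design steps named above (pseudonymisation, then aggregation before anything reaches the API) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not OPAL's actual implementation: the keyed-hash pseudonymisation, the `SECRET_KEY` value, the record layout, and the minimum group size used to suppress small cells are all hypothetical choices that the report does not mention.

```python
import hashlib
import hmac
from collections import defaultdict

# Hypothetical secret held only by the data owner (assumption).
SECRET_KEY = b"example-key-kept-by-the-operator"
# Hypothetical threshold: cells with fewer users are suppressed (assumption).
MIN_GROUP_SIZE = 3


def pseudonymise(subscriber_id, key=SECRET_KEY):
    """Replace an identifier with a keyed SHA-256 digest (HMAC)."""
    return hmac.new(key, subscriber_id.encode("utf-8"),
                    hashlib.sha256).hexdigest()


def aggregate_presence(records, min_group_size=MIN_GROUP_SIZE):
    """Count distinct pseudonymised users per (region, hour) cell.

    `records` is an iterable of (subscriber_id, region, hour) tuples.
    Cells below `min_group_size` are dropped so that small groups
    are never exposed to the caller.
    """
    groups = defaultdict(set)
    for subscriber_id, region, hour in records:
        groups[(region, hour)].add(pseudonymise(subscriber_id))
    return {cell: len(users) for cell, users in groups.items()
            if len(users) >= min_group_size}


# Hypothetical sample data, not real telco records.
sample = [
    ("221-001", "Dakar", 9), ("221-002", "Dakar", 9),
    ("221-003", "Dakar", 9), ("221-001", "Dakar", 9),  # repeat observation
    ("221-004", "Thies", 9),
]
print(aggregate_presence(sample))
```

In this sketch the caller only ever sees counts per area and time slot, which mirrors the idea that data is aggregated and anonymised before the API can access it.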
Moreno also shared use cases. Before conducting the national population census in Colombia, they used mobile phone data to learn about people’s activities so that they could organise their census operations around people's schedules. They were planning to use the data to study the migration of people within Colombia, so as to understand migration patterns within and outside towns. This would be useful in anticipating people’s service needs.
Audience feedback included questions on the funding of the system, as well as on how the board would manage competitors within the same market. On methodology, a participant wondered how representative the data in the system was, and how it could be standardised for national statistical operations.