Policy Reports

Pushing the Boundaries of Open Science at CERN:
Submission to the UNESCO Open Science Consultation

Summary

CERN, founded in 1954, is the world’s largest high-energy physics laboratory, supported by 23 member states and hosting over 12,000 scientists and engineers. Its mission is to provide particle accelerator facilities, conduct fundamental physics research, and unite people to advance science and technology. CERN has long embraced Open Science principles, which were embedded in its founding Convention. The organization views Open Science not as an obligation but as a responsibility to member states and the global scientific community. CERN has been at the forefront of various Open Science initiatives, including the Open Internet, Open Source, preprint culture, and Open Access to scientific publications. However, the complexity of CERN’s research environment presents ongoing challenges in fully implementing Open Science practices. The laboratory produces vast amounts of data (about 90 petabytes per year) from particle collisions, requiring advanced tools and services for data stewardship, analysis, and preservation. In 2020, CERN’s governing body strongly endorsed Open Science in its updated European Strategy for Particle Physics, encouraging the particle physics community to help shape and implement Open Science policies. The paper aims to describe CERN’s ecosystem of initiatives, projects, and technologies developed to maximize research impact through Open Science infrastructure, potentially serving as an inspiration for the global scientific community.

DEFINITION OF OPEN SCIENCE: THE KEY PILLARS

Open Science is a comprehensive approach aimed at improving the traditional research ecosystem by promoting transparency, accessibility, and collaboration. It encompasses several key pillars:
1. Open Access: This involves making scientific publications freely available online. It grants users the right to access, copy, use, distribute, and create derivative works, subject to proper attribution. Publications should be deposited in online repositories that ensure long-term archiving and unrestricted distribution.
2. Open Data: The principle that research data should be freely available for everyone to use, reuse, and redistribute. This includes not only research datasets but also administrative and governmental data. The FAIR principles (Findable, Accessible, Interoperable, and Reusable) guide the implementation of Open Data.
3. Open Tools: This refers to making research software and tools publicly available under open licenses. It also encourages open and transparent software development processes, often using community development platforms like GitHub.
4. Open Notebook Science: This involves sharing the entire research process from the beginning, granting access to virtual research workspaces and providing insights into every stage of the research.
5. Open Source: This pertains to software that is publicly available under an open license, allowing users to modify, expand, use, or share the source code. It includes specific criteria such as free distribution, access to source code, and the ability to create derived works.
6. Open Research Assessment: This promotes transparency in the peer review process. It can involve revealing reviewer identities, making review reports publicly available, or allowing open participation in the review process from the broader community.
7. Citizen Science: This involves engaging members of the general public in scientific research, either in collaboration with or under the direction of professional scientists. It can benefit research by providing new perspectives or helping manage large-scale data collection tasks.

These pillars collectively aim to make scientific knowledge more accessible, the research process more transparent, and scientific collaboration more inclusive. They promote the free sharing of information, methodologies, and tools while respecting necessary data protections. By fostering openness and reproducibility, Open Science seeks to accelerate scientific progress and increase public trust and engagement in scientific endeavors.

OPEN ACCESS AT CERN

CERN has been at the forefront of the Open Access movement, with its roots in the high-energy physics community’s practice of sharing preprints for over six decades. This culture of openness was further enhanced by the creation of the World Wide Web at CERN in 1989 and the establishment of online research repositories like arXiv. In 2014, CERN formalized its Open Access policy, requiring all original high-energy physics results to be published openly. This policy was expanded in 2017 to include related fields such as instrumentation and scientific computing. As a result, by 2019, 89% of CERN’s research articles were freely accessible and reusable worldwide. CERN achieves Open Access through various agreements with publishers and its participation in the SCOAP3 project. The Sponsoring Consortium for Open Access Publishing in Particle Physics (SCOAP3) is a global collaboration hosted at CERN, involving over 3,000 institutions from 43 countries. It operates on a ‘redirection of funds’ model, removing financial barriers for authors and making nearly 90% of scientific articles in high-energy physics freely available. To support Open Access, CERN maintains several services, including INSPIRE, the core information system for high-energy physics, and the CERN Document Server (CDS), the institutional repository. These platforms provide access to a wide range of scholarly works, multimedia content, and administrative documents. The CERN Yellow Report series, started in 1955, continues to be an important medium for communicating CERN’s work in an open-access format. Through these initiatives and services, CERN demonstrates its ongoing dedication to the principles of Open Access, fostering global collaboration and the free exchange of scientific knowledge in the field of high-energy physics and beyond.

OPEN DATA AT CERN

CERN, home to the Large Hadron Collider (LHC) and its four major experiments, generates an enormous amount of data annually, making it one of the world’s most prolific sources of experimental data. The organization recognizes the importance of open access to this data and adheres to the FAIR principles to maximize its scientific potential and fulfill its responsibility to member states and the global scientific community. CERN’s commitment to open data presents unique opportunities and challenges due to the complexity, value, and scale of the information generated. The organization has implemented policies and services to ensure responsible data sharing at various levels of abstraction and at different points in time. This approach allows for appropriate use, verification, and further research based on the data. The LHC experiments have adopted policies for data preservation and access, and CERN is currently developing an organizational Open Data policy to ensure consistency across experiments. This policy aims to make data publicly available after a suitable embargo period, enabling reuse by a wide community for various purposes, including research, education, and outreach. CERN has learned that simply making data available is not sufficient to achieve the benefits of open and reproducible research. As a result, the organization has developed a holistic approach to open data, considering openness throughout the research lifecycle. This includes not only the data itself but also associated software, workflows, and explanations to maximize scientific value and reproducibility. To support this approach, CERN has developed a suite of services and tools, including the CERN Analysis Preservation and Reuse framework. This framework consists of several interconnected services, such as CERN Analysis Preservation (CAP), Reusable Analysis Service (REANA), RECAST, HEPData, CERN Open Data portal (COD), and Zenodo.
These services work together to facilitate data preservation, discoverability, reproducibility, and reuse across various stages of the research process. In conclusion, CERN’s approach to open data demonstrates a commitment to advancing scientific knowledge through responsible data sharing and preservation. By developing comprehensive policies and innovative tools, CERN is not only making its valuable data accessible but also ensuring that it can be effectively used and built upon by the broader scientific community.

OPEN TOOLS, OPEN SOURCE & OPEN HARDWARE

CERN has been at the forefront of the Open Source movement since the creation of the World Wide Web, which was initially developed to facilitate information sharing among scientific collaborations. The Open Source philosophy, which advocates for free access to technology’s building blocks, has been a cornerstone of CERN’s approach to software development. This approach has led to the creation and release of several influential Open Source projects, including Invenio, Indico, EOS, and the MAlt Project. Invenio, a digital repository system, and Indico, an event management tool, are prime examples of CERN’s commitment to developing and sharing Open Source solutions. These tools have found widespread use in scientific communities and beyond, demonstrating the value of open collaboration in software development. CERN’s contributions extend beyond software to hardware as well. The CERN Open Hardware Licence provides a legal framework for sharing hardware designs, fostering collaboration between institutions and industry. A recent example of this approach is the development of a 3D-printed mask during the COVID-19 pandemic, which was released as an Open Source solution. The organization’s commitment to Open Source is further exemplified by projects like Geant4 and ROOT, which have become essential tools in particle physics research and data analysis. These projects showcase how Open Source development can lead to robust, widely-adopted solutions in specialized scientific fields. CERN’s pioneering role in the Open Source movement has not only benefited the scientific community but has also had far-reaching impacts on society at large. By championing open access to knowledge and technology, CERN continues to drive innovation and collaboration in science and beyond.

CITIZEN SCIENCE AT CERN

CERN and the LHC experiments have been involved in a number of efforts to engage the general public in the scientific work undertaken at the Laboratory. Under the umbrella of CERN against COVID-19, CERN contributed computers to the fight against COVID-19 using the Folding@home platform. In addition, CERN has its own volunteer-computing platform called LHC@home, where users can download software and run HEP simulations in a way that does not interfere with a computer’s normal operation. Citizen Cyberlab is a multidisciplinary platform that focuses on developing methods and studying motivations for new forms of public participation in research. Together with the UN Institute for Training and Research (UNITAR) and the University of Geneva, CERN organizes events ranging from online crowdsourcing to in-person hackathons to develop Open Source research tools and training that increase public involvement in scientific and social research. The Higgshunters project was a successful effort by ATLAS to motivate citizens to find interesting features in LHC data. In total, over 37,000 citizen scientists participated in the project and classified more than a million LHC events.

INTERNATIONAL ENGAGEMENT

CERN’s commitment to Open Science and international collaboration has been a cornerstone of its approach to scientific research and development. Over the years, CERN has actively participated in numerous European Commission projects, particularly within the Horizon 2020 framework. These projects have focused on areas such as Open Science, European research infrastructures, and e-infrastructures. By engaging in these initiatives, CERN has not only contributed to the Open Science discourse but has also maintained its position at the forefront of scientific innovation.

The organization’s involvement in projects like OpenAIRE, EOSC-hub, THOR, FREYA, and ESCAPE has allowed CERN and the broader High Energy Physics community to influence decision-making processes and stay informed about new developments in the field. This engagement has also facilitated the exchange of expertise with external partners across various domains, ensuring that CERN’s work remains relevant and cutting-edge. CERN has been proactive in adopting and supporting Open Science-related guidelines, including the FORCE11 Joint Declaration of Data Citation Principles, the FAIR Data Principles, and the Plan S Principles. The organization has also implemented recommendations from the Declaration on Research Assessment in its INSPIRE service.

These efforts demonstrate CERN’s commitment to promoting transparency, accessibility, and reproducibility in scientific research. Collaboration with external partners has been a key aspect of CERN’s approach to Open Science. Projects like the Asclepias project and the Biodiversity Literature Repository showcase CERN’s willingness to work with diverse organizations to advance scientific knowledge and make it more accessible to the global community. CERN has also taken a leading role in the implementation and adoption of Persistent Identifiers (PIDs). The organization has supported DOI versioning on its Zenodo platform and has made significant progress in increasing the adoption of mature PID types within community services. Additionally, CERN has been piloting new and emerging PIDs, such as ROR IDs for organizations and IDs for funders and grants, further demonstrating its commitment to enhancing the discoverability and interoperability of scientific information.

RECOMMENDATIONS

CERN proposes recommendations for UNESCO to promote global adoption of Open Science:
1. Implement open policies to guide research practices.
2. Align Open Access policies globally:
a. Ensure funded research is published with CC-BY licenses.
b. Simplify submission processes for Open Access publishing.
c. Transition to transformational agreements with publishers.
d. Explore cooperative models for Open Access publishing.
3. Enhance Open Data practices:
a. Include all research artifacts with data releases.
b. Implement analysis preservation mechanisms.
c. Enable revalidation, reinterpretation, and reuse of data.
4. Promote Open Source Hardware and Software development.
5. Develop fair and transparent research assessment methods:
a. Focus on research impact rather than publication outlet.
b. Include metrics for data reuse and result application.

The COVID-19 global health pandemic has brought into sharp focus both the inadequacies of traditional science and the embrace of Open Science practices as the most effective mechanism for addressing global crises. It is our firm belief that we are at an inflection point, where Open Science should go beyond an aspiration to become the default standard for global research. This will not only equip the scientific community to better respond to current and future challenges, but will also accelerate the advancement of the research enterprise itself, making it more efficient, transparent, responsive and dynamic.