Topic identifier: HORIZON-INFRA-2024-EOSC-01-04

Long-term access and preservation infrastructure development for EOSC, including data quality aspects

Type of action: HORIZON Research and Innovation Actions
Number of stages: Single stage
Opening date: 06 December 2023
Closing date: 12 March 2024 17:00
Budget: €8 000 000
Call: Enabling an operational, open and FAIR EOSC ecosystem (2024)
Call identifier: HORIZON-INFRA-2024-EOSC-01
Description:

Expected Outcome:

Project results are expected to contribute to all the following expected outcomes:

  • Practices, standards and tools for long-term preservation are mainstreamed in the EOSC ecosystem.
  • The emergence of a European distributed infrastructure for long-term preservation and access is adequately supported.
  • The sustainability of long-term preservation among the European scientific community is significantly enhanced.

Scope:

In the digital and data-driven paradigm promoted by Open Science, data is at the core of the scientific process and its production grows at ever-increasing rates. The volume of data is now many times larger than it was just two years ago. Research activities create many intermediate data objects across their phases; these are managed within the research data lifecycle, in which curation and preservation are key elements in making data accessible, interoperable and reusable. Costs and physical limitations of storage and service capacity raise the difficult question of what is worth preserving for the long term, narrowing the selection to data and other digital objects that will have long-term benefits for science and society.

Coordination to harmonise practices and standards within and across the different scientific fields, together with adequate infrastructures, is necessary to implement the level of curation and preservation needed and to offer the related services, which differ in practice and effort per discipline and type of data.

Considering that the European Open Science Cloud (EOSC) aims to address many of the challenges of ensuring the long-term preservation of data, along with the growing uptake of FAIR principles, proposals under this topic are expected to:

  • Establish a minimum set of practices and a general framework to identify which data are candidates for long-term preservation, based on their use, benefit and quality.
  • Support the creation of long-term preservation and access strategies and processes among the different scientific disciplines.
  • Engage and collaborate with domain-specific networks, creating new ones where necessary, to consolidate practices and standards, such as metadata and ontologies, that strengthen long-term access and preservation and support reproducibility, integrity and validity.
  • Build upon existing services and enrich EOSC with tools to store and access digital data for long periods, and to automate and federate certain specialised curation and preservation tasks.
  • Create a discipline-oriented expert curation network that will enhance and facilitate the curation process and the digital preservation actions needed to ensure data remain accessible as technology changes.
  • Identify within EOSC and consolidate a network of repositories and archives for long-term preservation to address economies of scale and better support the European science ecosystem. Such a network will have to be a superset of the network of trusted repositories whose development and coordination will be supported under the topic HORIZON-INFRA-2024-EOSC-01-03.
  • Capitalise on the results of the ARCHIVER project and address sustainability solutions to ensure long-term preservation services in the EOSC ecosystem.

In establishing a minimum set of practices and a general framework to identify candidate data for long-term preservation, the quality of the data plays a pivotal role in developing strategies that support the decision of what is worth preserving for the long term. The technical quality of data relates to the structure of the information objects, their adherence to standards, the use of commonly identified formats and the completeness of the metadata that describe them. On a deeper level, quality is also an assessment of the data's “fitness” for the intended scope and for further reuse. Quality assessment in support of preservation has to consider primarily the effective contribution of the data to the intended end point and the need for long-term access, but also the intra- and cross-disciplinary interest in reuse in other contexts and the richness of the documentation that enables reuse. Technical soundness is a necessary but not sufficient criterion; other quality assessments need to be based on a set of evaluation principles and indicators that need to be developed and widely adopted by the different scientific disciplines. Therefore, proposals under this topic are additionally expected to:

  • Coordinate disciplinary networks in which a wide representation of universities, research performing organisations and digital repositories, building on existing practices, will:
    • Develop and promote guidelines to produce high-quality data;
    • Agree on standards to assess the quality of data; and
    • Widely promote the above across the European research ecosystem.
  • Define, with experts from the disciplinary networks, common requirements for data quality that are valid across disciplines.

The selected proposals will be expected to align with the EOSC Partnership and to coordinate and collaborate with the projects funded under the topic HORIZON-INFRA-2024-EOSC-01-03 with regard to the interconnection of repositories and other archiving infrastructure, and with the projects funded under the topic HORIZON-INFRA-2023-EOSC-01-02, especially with regard to the quality dimension explored under that topic.