Expected Outcome:
Project results are expected to contribute to the following outcomes:
- EOSC will make available high-quality, machine-readable scientific datasets to be consumed by machine-driven Generative AI applications at the service of science, in line with the GenAI4EU[1] initiative and other key EU initiatives, like the Apply AI strategy.
- EOSC will facilitate the pooling and sharing of high-value datasets originating from EOSC and other data spaces identified as priorities (including, but not limited to, public sector, health, climate, environmental, manufacturing, agriculture, energy, financial and mobility data). The large-scale actions supported by EOSC will include the creation of common data platforms enabling secure and compliant sharing and reuse of sensitive, confidential, proprietary and personal data, as well as large-scale experimentation based on Generative AI, in line with the GenAI4EU initiative and other key EU initiatives, like the Apply AI strategy.
Scope:
The scope of this call is to demonstrate and foster the use of Generative AI for Scientific Research, in line with the GenAI4EU initiative and other key EU initiatives, like the Apply AI strategy, throughout the research data lifecycle supported by EOSC. Generative AI can be used to improve productivity in activities such as writing, data generation and analysis, reporting and many others. This enables lifting science beyond the human scale by facilitating the deployment and use of smart algorithms, machine learning and AI services on the Web of FAIR Data. Awareness of, and readiness to use, Generative AI for scientific research must be raised through training activities.
AI-powered natural language interfaces can transform the way researchers interact with open science infrastructures and how they discover and combine relevant data, software and application assets. EOSC should evolve towards offering such capabilities in ways that ensure unbiased and trustworthy responses. This includes adopting FAIR practices, for AI-trained models as well, to address challenges ranging from reproducibility to trustworthiness.
Open Data and Open Research Software are essential for reliable, trustworthy, and transparent GenAI. They ensure that datasets and algorithms are well-documented, accessible, and reproducible, enabling others to validate and understand GenAI algorithms. This transparency fosters trust, supports ethical standards, and ensures compliance with regulations, all of which are particularly important in the field of GenAI.
Proposals shall address all of the following aspects:
- Enrich the EOSC federation with Generative AI tools for evaluating research data quality, ensuring trustworthiness across the European network of trusted repositories, accessible by humans, machines, and Generative AI services: formulate protocols and policies to facilitate effortless data access, processing, and provenance updates within EOSC's repository and service network.
- Support European research infrastructures to improve the FAIRness of their data, so that they are ready to be combined with data of infrastructures in scientifically neighbouring domains, in order to provide Generative AI-ready data.
- Conduct pilots to validate the effectiveness and accuracy of the Generative AI-driven data quality evaluation methods, iteratively improving and refining them based on feedback and real-world use cases, and removing potential biases inherited from the training data.
- Run community engagement and support programmes for implementing Generative AI in scientific workflows via EOSC:
  - promote a sound training programme to facilitate the uptake and the use of Generative AI as a means to facilitate the FAIRification of data and data curation;
  - demonstrate how Generative AI can facilitate quality assessment of FAIR data;
  - advance the realization of machine-actionable (MA) research data and services, including AI-based systems;
  - propose protocols and policies to govern automatic data workflows within the network of repositories and services.
The proposals are expected to deliver on one or more of the following:
- Develop, promote and support real-life use cases for Generative AI models in scientific research domains, in line with the GenAI4EU initiative and other key EU initiatives, like the Apply AI strategy, such as:
  - augment datasets in scientific fields that rely on image analysis, such as biology, astronomy, and materials science: by generating synthetic images that closely resemble real data, researchers can expand their datasets, improve model robustness, share anonymized versions of sensitive data and generalize better to unseen scenarios;
  - learn the underlying patterns of complex time-series data, such as sensor readings in environmental monitoring or physiological signals in healthcare: by generating data samples that match the learned distribution, these models can detect anomalies or deviations from normal behaviour;
  - accelerate materials design and discovery by predicting the properties of new materials without the need for extensive experimental testing: these models can generate novel material structures with desired properties, such as strength, conductivity, or catalytic activity, based on learned relationships between material compositions and properties;
  - advance drug design and molecular modelling by generating novel molecular structures with desired pharmacological properties: these models can explore vast chemical spaces, predict the interactions between molecules and biological targets, and optimize drug candidates for efficacy and safety;
  - simulate complex systems and phenomena in various scientific domains, such as physics, chemistry, and ecology: by capturing the underlying dynamics and interactions of the system, these models can generate realistic simulations that mimic observed behaviour or predict future outcomes under different conditions.
Proposers should take into account and build on the results of relevant projects in the field, including AI4EOSC[2], iMagine[3], EOSC Data Commons[4], RI-SCALE[5], and other developments within the scope of the GenAI4EU initiative and other key EU initiatives, like the Apply AI strategy.
This topic implements the co-programmed European Partnership for the European Open Science Cloud.
Proposals could consider the inclusion of the European Commission's Joint Research Centre (JRC) research infrastructure in their research infrastructure portfolio for the creation and sharing of high-quality machine-readable scientific datasets to be consumed by machine-driven Generative AI applications. In this regard, the JRC will consider collaborating with any successful proposal.
[1] This call falls under the ‘GenAI4EU’ initiative as set out in the Communication from the Commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions on boosting startups and innovation in trustworthy artificial intelligence (COM(2024) 28 final of 24.1.2024).
[2] https://ai4eosc.eu/
[3] https://www.imagine-ai.eu/
[4] Grant no 101188179 from the call HORIZON-INFRA-2024-TECH-01
[5] Grant no 101188168 from the call HORIZON-INFRA-2024-TECH-01