This topic belongs to the INDUSTRY call
Topic identifier: HORIZON-CL4-INDUSTRY-2025-01-DIGITAL-61

AI Foundation models in science (GenAI4EU) (RIA)

Type of action: HORIZON Research and Innovation Actions
Opening date: 22 May 2025
Closing date 1: 23 September 2025 00:00
Budget: €30 000 000
Call: INDUSTRY
Call Identifier: HORIZON-CL4-2025-01
Description:

Expected Outcome:

  • Accelerate research and development in science, with a focus on the domains of a) materials science, b) climate change science, c) environmental pollution science (including PFAS) and d) agricultural science;
  • Advance AI technology (not limited to Generative AI) tailored for scientific needs and potentially adaptable to other tasks in the area of application;
  • Contribute to the development of foundation models in the areas of application, and pave the way for future funding of foundation models in a broader range of scientific disciplines;
  • Advance solutions to societal or scientific challenges;
  • Bridge existing knowledge gaps and induce interdisciplinarity by design across different fields necessary to advance the area of application; and
  • Support open-source and open science, especially for research communities with limited access to modern AI tools.

Scope:

Foundation models in science are an evolving idea in the scientific community and go beyond the Generative AI trend[[Some examples in science include: a foundation model in materials science ([2401.00096] A foundation model for atomistic materials chemistry (arxiv.org)), the Helmholtz Foundation Model Initiative (Helmholtz Foundation Model Initiative - Helmholtz Home), the Trillion Parameter Consortium (https://www.anl.gov/article/new-international-consortium-formed-to-create-trustworthy-and-reliable-generative-ai-models-for), NASA (NASA and IBM Openly Release Geospatial AI Foundation Model for NASA Earth Observation Data | Earthdata), and the University of Michigan (Scientific Foundation Models (scifm.ai)).]]. The purpose of this topic is to tap into their potential, and to advance the development of AI technology specifically tailored for the needs of science.

A foundation model[1] can integrate information from various modalities of data. This model can then be adapted to a wide range of downstream, more specialized tasks. To build downstream applications, the foundation model is fine-tuned with additional training and task-specific examples. Therefore, a foundation model is itself incomplete but serves as the common basis from which many task-specific models can be built via adaptation.
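
The adaptation step described above can be illustrated with a toy sketch. It is not a real foundation model: `pretrained_features` is a hypothetical stand-in for a frozen, pre-trained model, and `fit_head` is an assumed helper that trains only a small task-specific layer on top of it. The point is that one shared basis serves two different downstream tasks.

```python
def pretrained_features(x: float) -> list[float]:
    """Hypothetical stand-in for a frozen foundation model: maps an input to shared features."""
    return [1.0, x, x * x]


def fit_head(inputs: list[float], targets: list[float],
             lr: float = 0.1, epochs: int = 2000) -> list[float]:
    """Fine-tune only a small linear head on top of the frozen features (gradient descent on MSE)."""
    weights = [0.0, 0.0, 0.0]
    n = len(inputs)
    for _ in range(epochs):
        grads = [0.0, 0.0, 0.0]
        for x, y in zip(inputs, targets):
            feats = pretrained_features(x)
            err = sum(w * f for w, f in zip(weights, feats)) - y
            for i, f in enumerate(feats):
                grads[i] += 2 * err * f / n
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights


def predict(weights: list[float], x: float) -> float:
    """Apply a task-specific head to the shared features."""
    return sum(w * f for w, f in zip(weights, pretrained_features(x)))


# Two downstream tasks adapted from the same frozen basis, each with a few task-specific examples.
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
task_a = fit_head(xs, [2 * x + 1 for x in xs])  # task A: y = 2x + 1
task_b = fit_head(xs, [x * x for x in xs])      # task B: y = x^2
```

Calling `predict(task_a, 0.25)` and `predict(task_b, 0.25)` then gives values close to the two task functions at that point, even though both heads share the same unchanged feature extractor.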

In science, such foundation models could be trained on data from a specific scientific field and then be fine-tuned for a variety of tasks and used by a wider community in the field.

Proposals should address one of the following scientific domains:

  • (A) Materials science: the development of new, innovative and advanced materials is essential for the EU’s economic security and for achieving a competitive and sustainable industry (especially sectors such as energy, mobility, construction, health and electronics). Employing AI in the process of materials design, characterisation and discovery could significantly accelerate and scale potential innovative solutions.
  • (B) Climate change science: advancing climate research is critical for achieving the EU's climate neutrality and resilience goals. AI foundation models can contribute to more accurate insights into climate dynamics, enhanced predictions of extreme weather events, regional impacts and the evolution of climate tipping points.
  • (C) Environmental pollution sciences: advancing environmental sciences can support the detection and characterisation of pollution sources, as well as their pathways, distribution and impacts on the environment and human health. This is particularly relevant in the case of pollutants of concern, emerging and/or less known pollutants.
  • (D) Agricultural sciences: advancing agricultural sciences research is critical to achieving a competitive, resilient and sustainable agricultural system. AI foundation models can contribute to enhancing crop, livestock, soil and water management.

Proposals should focus on 1) developing foundation models (not limited to Generative AI) for science in the chosen domain; 2) showing a foundation model’s usefulness by adapting it to subtasks/scientific problems in the chosen domain; and 3) illustrating other possible areas of application.

The foundation models should provide researchers with access to essential AI-enabled capabilities for scientific discovery; employ the machine learning algorithms, models and architectures best suited for the chosen domain; be adaptable to different problems in the domain[2]; and be based on a robust and reliable architecture, as any potential errors and problems would be propagated to the downstream applications.

The foundation models should be placed at the disposal of the scientific community as open models, including the source code and, where possible, training datasets and other associated assets needed for full reusability of the foundation models (unless justified otherwise). This will serve a wider scientific community, thus broadening access to such scientific infrastructure and facilitating the use and adaptation of the model to different problems. Proposers should provide clear documentation on the use and limitations of the model, alongside case studies demonstrating the model's application to a variety of tasks/problems in the chosen domain.

Multidisciplinary research activities should involve both AI and domain scientists, and address some of the following:

  • Conceptualisation and planning: the scope, objectives and expected outcomes of the foundation model;
  • Suitable interfaces for domain experts without computer science background to contribute to and utilise the outcomes;
  • Data identification, collection and management of (preferably diverse, multimodal) datasets through semantically annotated data schemas;
  • Model development, validation, testing under relevant operational and environmental conditions (such as thermal gradients, fatigue, corrosion, etc.) and, as appropriate, model evaluation and benchmarking, for example DOME[3];
  • Integration of domain knowledge into the model (for example through machine-readable representations such as RDF (Resource Description Framework)).
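
As an illustration of the last item, a piece of domain knowledge can be written as RDF triples (subject, predicate, object). The sketch below serialises two triples in N-Triples syntax using only the Python standard library. The `http://example.org/materials/` namespace, the sample identifier, the `Alloy` class and the `hasDensity` property are all hypothetical and are not taken from any real ontology.

```python
EX = "http://example.org/materials/"  # hypothetical namespace, for illustration only
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"


def iri(value: str) -> str:
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def double_literal(value: float) -> str:
    """Format a typed numeric literal."""
    return f'"{value}"^^<{XSD_DOUBLE}>'


# Each triple states one machine-readable fact about a (hypothetical) material sample.
triples = [
    (iri(EX + "sample42"), iri(RDF_TYPE), iri(EX + "Alloy")),
    (iri(EX + "sample42"), iri(EX + "hasDensity"), double_literal(7.85)),
]


def to_ntriples(rows: list[tuple[str, str, str]]) -> str:
    """Serialise triples one per line, each terminated by ' .'."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in rows)


print(to_ntriples(triples))
```

In practice a project would more likely rely on an established RDF library and a community ontology; the hand-rolled serialiser here only shows the shape of the data.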

Proposals should:

  • Demonstrate access to the high-quality (multimodal) data needed for the development of the model. If new datasets need to be created or existing ones adapted while developing the model, they should follow the FAIR[4] principles. Describe the data curation and quality control procedures that will be used to ensure the accuracy, completeness, and consistency of the training data.
  • Contribute to efforts to reach common standards for data formats, metadata, taxonomies and ontologies.
  • Demonstrate a strategy[5] to access the computational resources needed for model training, evaluation/testing and inference.
  • Propose a model architecture that is designed with transparency in mind.
  • Ideally, employ methodologies for integrating domain/interdisciplinary knowledge into the model and seek synergies with solutions that facilitate managing and making sense of vast amounts of data (for example knowledge graphs).
  • Identify at least four possible use cases and scientific challenges that can be addressed with the model and its adaptations.[6]
  • Identify and assess the potential risks of misuse of the foundation model.
  • Propose a plan to make the model public, maintain and evolve it and promote it to the scientific community on a regular basis, in order to give visibility to the concept, discuss key findings and anticipate the technology evolution – possibly in synergy with other relevant projects.

Proposals should involve expertise in Social Sciences and Humanities (SSH) in cases where legal and ethical experts are needed to address data privacy, sharing agreements, and compliance with regulations.

Synergies with the selected projects from HORIZON-INFRA-2025-01-EOSC-06: Using Generative AI (GenAI4EU) for Scientific Research via EOSC are encouraged, where relevant. Proposals are encouraged to collaborate with established infrastructures such as the WeatherGenerator[7] project.

International cooperation is encouraged where it brings reciprocal benefit to the EU, for example with the Trillion Parameter Consortium.[8]

In this topic the integration of the gender dimension (sex and gender analysis) in research and innovation content is not a mandatory requirement.

[1] “Foundation model” is a term defined by the Center for Research on Foundation Models of Stanford University in: “On the Opportunities and Risks of Foundation Models”, https://arxiv.org/pdf/2108.07258.pdf

[2] An example in materials science, for inspiration only: 2401.00096.pdf (arxiv.org)

[3] https://dome-ml.org/

[4] Findable, Accessible, Interoperable, Reusable data.

[5] In case the project plans to use the EuroHPC network, the EU-funded project EPICURE offers an application support service for EuroHPC: Epicure - European Commission (europa.eu)

[6] Examples include, but are not limited to: (for materials science) alternatives to hazardous materials like PFAS, materials that lower environmental footprint, materials for quantum technology, for higher capacity batteries, for more efficient photovoltaic devices, etc.; (for climate science) enhanced prediction of climate and weather extremes, early warning systems, forecasting of climate-driven migration, monitoring of the global carbon budget, and monitoring and measuring adaptation effectiveness; (for environmental pollution sciences) solutions for the detection and assessment of pollution, including pollutants of emerging concern; (for agricultural sciences) enhanced prediction of the impact of plant pests, monitoring of animal health and welfare, monitoring of soil health or of water management in agriculture.

[7] https://weathergenerator.eu/

[8] Ref. Trillion Parameter Consortium (TPC) - Generative AI for Science and Engineering