Internship 24/25: Combining Transformers and Normalizing Flows for Deep Surrogate Training
- Level: Master's level research internship (M2) or equivalent (final-year engineering school internship)
- Where: UGA campus, Grenoble (also possibility to be located at EDF R&D Saclay)
- When: 2024-2025, 4 months minimum
- Financial support: slightly more than 500 euros/month
- Employer: INRIA and Univ. Grenoble Alpes.
- Team: Datamove
- Advisers: Bruno Raffin (Bruno.Raffin@inria.fr), Alejandro Ribes (alejandro.ribes@edf.fr) and Abhishek Purandare (abhishek.purandare@inria.fr)
Context
Deep surrogates are deep neural networks trained on data produced by numerical scientific simulation codes for, e.g., fluid dynamics, weather forecasting, or molecular systems. Deep surrogates are expected to be faster and smaller than the original simulation. A wide variety of neural architectures is used for deep surrogates, such as U-Net, FNO, and GNN. Deep surrogates show different generalization capabilities: some are trained from the data of a single simulation, others from multiple simulation instances configured with different input parameters. A new trend is to train foundation models for scientific applications, leading to a neural network capable of supporting different types of simulations. These foundation models are often based on a vision transformer architecture adapted to scientific data. The transformer architecture brings two key features for deep surrogates: (1) the attention mechanism captures correlations between simulation time steps; (2) the tokenization, with positional encoding, of input data into small patches makes the architecture flexible with respect to the resolution of the input data. In parallel, normalizing flow architectures, which map a known probability distribution onto another one known only partially through data, have interesting properties: (1) they are invertible and can thus be used to solve inverse problems; (2) they convey a measure of uncertainty through the learned probability distribution, a very important piece of information for scientific computing.
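The invertibility property of normalizing flows can be illustrated with a minimal affine coupling layer in the style of RealNVP. This is a toy sketch, not taken from any of the architectures discussed here: the class name, the fixed random linear "conditioner" (which stands in for a learned neural network), and the toy dimensions are all illustrative.

```python
import numpy as np

class AffineCoupling:
    """Toy RealNVP-style coupling layer: the second half of the input is
    scaled and shifted as a function of the first half, so the map can be
    inverted exactly."""

    def __init__(self, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.half = dim // 2
        # Fixed random linear maps standing in for the neural network
        # that would predict the log-scale s and shift t in a real flow.
        self.W_s = rng.normal(scale=0.1, size=(dim - self.half, self.half))
        self.W_t = rng.normal(scale=0.1, size=(dim - self.half, self.half))

    def forward(self, x):
        x1, x2 = x[:self.half], x[self.half:]
        s = self.W_s @ x1          # log-scale, conditioned on x1
        t = self.W_t @ x1          # shift, conditioned on x1
        return np.concatenate([x1, x2 * np.exp(s) + t])

    def inverse(self, y):
        # y1 equals x1, so s and t can be recomputed and undone exactly.
        y1, y2 = y[:self.half], y[self.half:]
        s = self.W_s @ y1
        t = self.W_t @ y1
        return np.concatenate([y1, (y2 - t) * np.exp(-s)])

rng = np.random.default_rng(1)
x = rng.normal(size=4)
layer = AffineCoupling(4)
y = layer.forward(x)
print(np.allclose(layer.inverse(y), x))  # True: the flow is invertible
```

Stacking many such layers (with the roles of the two halves alternating) yields an expressive yet still invertible map, which is what makes flows usable for inverse problems.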
Internship Goals
The goal of this internship is to investigate how effective the combination of transformer and normalizing flow architectures can be for training deep surrogates. As a starting point, we will consider several available architectures from the papers All-in-one simulation-based inference, Poseidon: Efficient Foundation Models for PDEs, and ClimaX: A foundation model for weather and climate, which will be analyzed, tested and possibly combined. For the experiments, we will integrate these models into the Melissa framework developed in our team. Melissa makes it possible to train deep surrogates on supercomputers directly from the running simulations while they produce data. Melissa trains very efficiently on significantly more data than the classical offline approaches, which store the simulation outputs to files and then read them back for training. Melissa also makes it easier to train deep surrogates by combining data production and training in a unified workflow.
Work environment
The candidate will join the DataMove team, located in the IMAG building on the Saint-Martin-d'Hères campus (Univ. Grenoble Alpes) near Grenoble. The DataMove team is a friendly and stimulating environment gathering professors, researchers, PhD and Master's students. Grenoble is a student-friendly city surrounded by the Alps, offering a high quality of life and all kinds of mountain-related outdoor activities.
There is also the possibility to pursue this internship at EDF R&D in Saclay, close to Paris. EDF is one of the largest electricity suppliers in Europe, and its Saclay R&D labs form one of the largest industrial research centers in France. EDF is a long-term collaborator, actively involved in the development of Melissa and in deep-surrogate investigations. EDF also brings industrial-grade use cases related to electrical machines (produced by Code_Carmel) and hydrological studies (produced by the open source code Open-Telemac).
Our Related Publications
- MelissaDL x Breed: Towards Data-Efficient On-line Supervised Training of Multi-parametric Surrogates with Active Learning, SC AI4S 2024: https://hal.science/hal-04712480v1
- Training Deep Surrogate Models with Large Scale Online Learning, ICML 2023: https://hal.science/hal-04102400v1
- High Throughput Training of Deep Surrogates from Large Ensemble Runs, SC 2023, https://hal.science/hal-04213978v1
- Deep Surrogate for Direct Time Fluid Dynamics, NeurIPS 2021, Thirty-fifth Workshop on Machine Learning and the Physical Sciences: https://hal.science/hal-03451432v2