Adaptive sampling for training deep learning models with simulation data

Level: Master level research internship (M2) or equivalent (final-year engineering internship)
Where: UGA campus, Grenoble
When: 2022-2023, 4 months minimum
Financial support: a little more than 500 euros/month
Employer: INRIA and Univ. Grenoble Alpes
Team: DataMove
Adviser: Bruno Raffin (Bruno.Raffin@inria.fr)

Context and Work

Simulation-Based Inference (SBI) is emerging as a promising approach for Bayesian inference when a simulation code capable of generating synthetic observations is available. Unlike statistical methods such as ABC, SBI relies on a neural architecture to learn the likelihood or the posterior probability distribution from data sets generated with the simulation code. The ability of neural networks to cope with the curse of dimensionality better than classical statistical methods, combined with progress on neural architectures, has led to very promising results.

Among the many open research directions for pushing SBI forward, one is active learning. The SBI neural architecture is trained on synthetic data generated with a simulation code. This makes a huge difference compared to traditional neural network training performed on fixed, limited data sets, as in image or natural language applications. Synthetic data can be generated at will, in potentially unlimited amounts; its quality can be controlled; and the coverage of the parameter space can be adapted to focus training where it is most useful. Active learning is concerned with this last point: an adaptive process observes the training as it takes place and controls the sampling of the simulator's input parameters to generate data that are hopefully more relevant for training. The expected benefits are 1) speeding up training and 2) increasing its quality.

One direct way to enable active learning in SBI is to use the posterior estimate being learned as a prior to generate the next data points to be integrated into the training set. Today, active learning for SBI usually follows a simple phased algorithm: 1) generate an initial training set, usually by sampling input parameters uniformly; 2) (re)train on this data set; 3) use the current neural network as a prior to generate a new data set, and return to 2). More advanced approaches remain to be investigated.

Other areas of machine learning have been investigating active learning too. It is common in reinforcement learning, where, in the actor/critic model, a simulator is coupled with the currently trained policy to control the generation of trajectories. As they are produced, these trajectories are integrated into the data set used to train a new policy. When the new policy is evaluated as sufficiently better, it replaces the one used in the actor. To be successful, it is critical to sample data with a mix of what reinforcement learning calls exploration and exploitation: a fraction of the samples specifically targets the exploration of new areas of the simulation domain, while the remaining ones keep providing samples from already seen areas (exploitation of acquired knowledge). This exploitation is important to avoid catastrophic forgetting, where a neural network exposed only to exploration data tends to forget what it learned from older data. Reinforcement learning training is also phased, like SBI (the policy is updated periodically). This leads to a side effect known as off-policy learning, where the data generated by the simulator are driven by a policy that lags behind the one being trained; this degrades training quality and requires corrective measures.
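To make this concrete, here is a minimal sketch of the phased SBI loop described above, with an exploration/exploitation mix in the proposal step. Everything in it is a toy placeholder: simulator stands in for an expensive simulation code, fit_posterior for (re)training a neural density estimator, and all numerical values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulator(theta):
    # Toy stand-in for an expensive simulation code: a noisy
    # observation of the input parameter theta.
    return theta + rng.normal(0.0, 0.5, size=theta.shape)

def fit_posterior(thetas, xs, x_obs):
    # Stand-in for (re)training a neural density estimator: weight each
    # theta by how close its simulated output lies to the observation,
    # then fit a Gaussian to the weighted samples.
    w = np.exp(-0.5 * ((xs - x_obs) / 0.5) ** 2)
    w /= w.sum()
    mu = np.sum(w * thetas)
    sigma = np.sqrt(np.sum(w * (thetas - mu) ** 2)) + 1e-3
    return mu, sigma

def propose(n, mu, sigma, eps=0.2):
    # Exploration/exploitation mix: with probability eps draw from the
    # broad uniform prior (exploration), otherwise from the current
    # posterior estimate (exploitation), to keep covering already seen
    # regions and limit catastrophic forgetting.
    explore = rng.random(n) < eps
    samples = rng.normal(mu, sigma, size=n)
    samples[explore] = rng.uniform(-5.0, 5.0, size=explore.sum())
    return samples

x_obs = 1.5
# 1) Initial training set: parameters sampled uniformly from the prior.
thetas = rng.uniform(-5.0, 5.0, size=200)
xs = simulator(thetas)

for phase in range(5):
    # 2) (Re)train the posterior estimate on all data gathered so far.
    mu, sigma = fit_posterior(thetas, xs, x_obs)
    # 3) Use the current estimate to propose the next batch of input
    #    parameters, simulate them, and return to step 2.
    new_thetas = propose(200, mu, sigma)
    thetas = np.concatenate([thetas, new_thetas])
    xs = np.concatenate([xs, simulator(new_thetas)])
    print(f"phase {phase}: posterior estimate ~ N(mu={mu:.2f}, sigma={sigma:.2f})")
```

The eps fraction of prior samples keeps covering the whole parameter space, which is what prevents the loop from collapsing onto an early, possibly wrong, posterior estimate.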
Adaptive strategies are also emerging in the domain of Physics-Informed Neural Networks (PINNs). In that case there is no simulator: the neural network is trained on points sampled in the domain, and it tries to minimize, at these points, the residual of the partial differential equation representing the physical process it approximates. Several strategies for actively sampling these points have been developed, with impressive gains in some cases. In these approaches the training loss is used as the metric driving the sampling process, following the simple idea that more samples are needed where the loss is high. Here again we find the need to keep both exploring and exploiting points (a sketch of this residual-driven sampling is given after the references).

This internship focuses on investigating novel active learning approaches adapted to SBI. After getting familiar with SBI and reading about existing work, the candidate will elaborate new active learning strategies for SBI and evaluate their performance through experiments with simulators of growing complexity. If necessary, the student will have access to supercomputers to enable faster training. Our team has developed a framework called Melissa to couple simulation-based data generation and online training on supercomputers (without active learning so far) that can be reused and extended. We have also been working on active learning for Physics-Informed Neural Networks, and the candidate will benefit from the code base and experience acquired in this domain.

The internship will take place in the DataMove team, located in the IMAG building on the Saint-Martin-d'Hères campus of Univ. Grenoble Alpes, near Grenoble. DataMove is a friendly and stimulating environment gathering professors, researchers, PhD and Master students. Grants are available to pursue a PhD after this Master internship. Grenoble is a student-friendly city surrounded by the Alps, offering a high quality of life and all kinds of mountain-related outdoor activities.

References

- The Frontier of Simulation-Based Inference. https://www.pnas.org/content/117/48/30055
- Adaptive Generation of Training Data for ML Reduced Model Creation. https://www.osti.gov/biblio/1923172
- A comprehensive study of non-adaptive and residual-based adaptive sampling for physics-informed neural networks. https://arxiv.org/abs/2207.10289
- Mitigating Propagation Failures in Physics-informed Neural Networks using Retain-Resample-Release (R3) Sampling. https://arxiv.org/abs/2207.02338
- Deep Active Learning by Leveraging Training Dynamics. https://openreview.net/forum?id=aJ5xc1QB7EX
- Optimizing Sequential Experimental Design with Deep Reinforcement Learning. https://proceedings.mlr.press/v162/blau22a/blau22a.pdf
- Off-Policy Actor-Critic with Shared Experience Replay. https://proceedings.mlr.press/v119/schmitt20a.html
- IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures. https://arxiv.org/abs/1802.01561
- Melissa: Simulation-Based Parallel Training. https://hal.science/hal-03842106/file/main.pdf
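As announced above, here is a minimal sketch of residual-driven adaptive sampling for PINNs, loosely inspired by the residual-based and R3 sampling papers listed in the references. The function residual_fn is a hypothetical stand-in for evaluating the PDE residual of the current network, and the 1-D unit-interval domain, batch sizes, and toy residual are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def adaptive_points(residual_fn, n_keep, n_explore, n_candidates=10_000):
    # Over-sample candidate points uniformly in the domain, keep those
    # where the PDE residual of the current network is largest (more
    # samples where the loss is high), and mix in fresh uniform points
    # so that well-fitted regions keep being sampled too.
    candidates = rng.uniform(0.0, 1.0, size=n_candidates)
    res = np.abs(residual_fn(candidates))
    hard = candidates[np.argsort(res)[-n_keep:]]
    uniform = rng.uniform(0.0, 1.0, size=n_explore)
    return np.concatenate([hard, uniform])

def toy_residual(x):
    # Toy residual concentrated near x = 0.5, standing in for the
    # residual of a real PINN on its PDE.
    return np.exp(-100.0 * (x - 0.5) ** 2)

points = adaptive_points(toy_residual, n_keep=80, n_explore=20)
```

In a real PINN training loop this selection would be rerun periodically, since the residual landscape changes as the network improves.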