# [ICML paper] Navier-Stokes 2D

## Overview
This issue aims at synchronizing and reporting the work related to the Navier-Stokes 2D use-case extracted from the FNO paper (see section 5.3 in Zongyi Li et al.).
Note: this paper was already introduced in the FNO MR.
In our context, the idea is to highlight the benefits of online training, where the neural architecture is trained with more simulations than in an offline mode, but where the network is fed each trajectory only once. In other words: a single epoch, but the same number of batches as offline training.
To do so, we need to:
- select an offline training experiment that will serve as a reference for comparison with the online mode,
- build a validation set and a test set of respectively N and M simulations, to be determined.
## Training experiments
Experiments are detailed in the table below (adapted from table 1 in Zongyi Li et al.). They were all performed with a mesh discretization of 64x64 and batch size of 20.
Experiment | Non-dimensional viscosity coefficient | Number of time steps | Number of simulations | Best architecture (Relative error) |
---|---|---|---|---|
1 | 1e-3 | 50 | 1000 | FNO-3D (0.0086) |
2 | 1e-4 | 30 | 1000 | FNO-2D (0.1559) |
3 | 1e-4 | 30 | 10 000 | FNO-3D (0.0820) |
4 | 1e-5 | 20 | 1000 | FNO-2D (0.1556) |
Note: one notices already that experiments 2 and 3 differ only in the number of simulations, yet yield significantly different levels of accuracy.
Although the number of epochs is not specified for each experiment, the sources and the paper (see Figure 3) suggest that trainings were performed for 500 epochs. Hence an experiment involving 1000 simulations offline would require 500x1000 = 500 000 simulations in total. Reaching this number will probably be challenging enough, so experiments with 10 000 simulations seem out of reach for now. In addition, given that simulation execution time is proportional to the total number of time steps, and that the FNO-2D architecture was retained, experiment 4 seems like the best candidate for investigating online training.
Note: although the offline training must be performed over 500 epochs, conclusive comparisons are feasible over fewer (for instance, with the loss displayed over 100 epochs only). Hence, online experiments of only 100 000 simulations could be relevant as a first step.
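The budget arithmetic above can be sketched as follows. This is a small illustrative script, not part of the sources: the experiment numbers are taken from the table, `EPOCHS = 500` follows the estimate from Figure 3, and the cost proxy rests on the stated assumption that simulation time is proportional to the number of time steps.

```python
# Hypothetical budget estimate for the online experiments.
# Assumptions: 500 offline epochs (Figure 3), and data-generation cost
# proportional to (time steps) x (number of simulations).
EPOCHS = 500

experiments = {
    # experiment id: (time_steps, n_simulations), from the table above
    1: (50, 1000),
    2: (30, 1000),
    3: (30, 10_000),
    4: (20, 1000),
}

def online_budget(n_simulations, epochs=EPOCHS):
    """Simulations needed online to match `epochs` offline passes."""
    return epochs * n_simulations

def cost_proxy(time_steps, n_simulations, epochs=EPOCHS):
    """Relative data-generation cost: total time steps to simulate."""
    return time_steps * online_budget(n_simulations, epochs)

for exp_id, (steps, sims) in experiments.items():
    print(f"experiment {exp_id}: "
          f"{online_budget(sims)} simulations, cost proxy {cost_proxy(steps, sims)}")
```

For the 1000-simulation experiments, the proxy singles out experiment 4 (20 time steps) as the cheapest, consistent with the choice made above; a 100-epoch first step would be `online_budget(1000, epochs=100) = 100 000` simulations.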
## Validation/test sets
In the original sources, the training is performed with a training set and a test set only, composed of 1000 and 200 simulations respectively (see here). This means that 1200 simulations should be computed beforehand.
In the online mode, the test loss should then be computed every epoch equivalent, i.e. every 1000/20 = 50 batches.
Note: as evident from the sources, there is no callback, which means that the weights are check-pointed at the end of training and do not depend on the network's performance on a validation set. In this case, the validation set is therefore superfluous.
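A minimal sketch of the online loop described above, with plain-Python stubs rather than the actual FNO/UNet code (all names here are hypothetical): each batch is consumed exactly once, and the test loss is logged every epoch equivalent of 50 batches; there is no checkpoint callback, so the weights are simply those at the end of training.

```python
# Sketch only: `train_step` and `test_loss` stand in for the real
# FNO/UNet training and evaluation routines (hypothetical names).
N_TRAIN = 1000                        # offline training-set size (reference)
BATCH_SIZE = 20
EVAL_EVERY = N_TRAIN // BATCH_SIZE    # 50 batches = one epoch equivalent

def online_training(batch_stream, train_step, test_loss, n_batches):
    """Consume each batch once; log the test loss every EVAL_EVERY batches."""
    history = []
    for i in range(1, n_batches + 1):
        train_step(next(batch_stream))        # single pass per trajectory
        if i % EVAL_EVERY == 0:
            history.append((i, test_loss()))  # no validation-based callback
    return history

# Toy usage: 5000 online simulations -> 250 batches -> 5 evaluations.
hist = online_training(iter(range(250)),
                       train_step=lambda batch: None,
                       test_loss=lambda: 0.0,
                       n_batches=250)
```

The key design point is that the evaluation cadence is expressed in batches of the *offline* reference run, so online and offline loss curves are directly comparable per epoch equivalent.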
## TO-DO list

- offline computation of the training/test sets (1200 simulations),
- addition of the test loss computation to the online training algorithms (FNO and UNet),
- performing offline training with FNO,
- performing offline training with UNet,
- performing online training with FNO,
- performing online training with UNet.