Fno use case (!64) · Merge requests · melissa / Melissa

SCHOULER Marc requested to merge fno-original into develop Dec 20, 2022

This MR adds a new use case based on the FNO paper.

The data generator solves the 2D Navier-Stokes equation for a viscous, incompressible flow in vorticity form on the unit torus (see section 5.3).

The architecture considered is the FNO-2D model, which uses a recurrent structure to propagate the solution in time (see fourier_2d_time.py).

Useful resources regarding FNO and the vorticity simulation are available at the following links:

Use case description

As a first step, a configuration close to the first one in Table 1 was considered:

dimensionless viscosity coefficient nu=1e-3,
a total of T=20 snapshots (vs. 50 in the paper), each corresponding to 1 second of physical time,
a mesh density of 64 by 64,
a total of N simulations.

Note: the actual time step is dt=1e-4 s but we only keep 1 snapshot every 10 000 time steps.
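
As a quick sanity check on the note above, the sketch below recomputes the physical time between two saved snapshots from dt and the saving interval. The variable and function names are illustrative and are not taken from the data generator; the convention that a snapshot is saved when the step index is a multiple of the interval is an assumption.

```python
# Values quoted in the use case description.
dt = 1e-4              # solver time step, in seconds
save_every = 10_000    # one snapshot is kept every 10 000 solver steps

# Physical time separating two consecutive saved snapshots (about 1 second).
seconds_per_snapshot = dt * save_every


def is_saved(step: int) -> bool:
    """Return True when a solver step falls on a saving boundary.

    Assumed convention: a snapshot is kept when the step index is a
    multiple of save_every.
    """
    return step % save_every == 0


print(f"{seconds_per_snapshot:.1f} s of physical time per snapshot")
```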

What is specific to this use case is that the neural network is trained to predict the next 10 snapshots (vs. 40 in the paper) from the first 10:

the operator aims at mapping the vorticity up to time 10 to the vorticity up to time T>10
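
In terms of data, this mapping amounts to cutting each trajectory into an input half and a target half. The sketch below illustrates the split, assuming the T=20 snapshots are indexed from 0 to 19; `split_trajectory` is a hypothetical helper, and string labels stand in for the 64 by 64 vorticity fields.

```python
T = 20      # snapshots per simulation
T_IN = 10   # snapshots given to the network as input


def split_trajectory(snapshots):
    """Split one trajectory into (x, y): the first T_IN snapshots and the rest."""
    if len(snapshots) != T:
        raise ValueError(f"expected {T} snapshots, got {len(snapshots)}")
    return snapshots[:T_IN], snapshots[T_IN:]


# Labels stand in for the actual vorticity fields.
trajectory = [f"u({t})" for t in range(T)]
x, y = split_trajectory(trajectory)

print(x[0], "...", x[-1])   # first and last input snapshots
print(y[0], "...", y[-1])   # first and last target snapshots
```

Each simulation therefore contributes a single (x, y) sample, regardless of how many solver steps it runs.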

In this context, each simulation produces only one data pair (x,y), where x=(u(0),...,u(9)) is used to predict y=(u(10),...,u(19)). The immediate consequences are:

In the sense of Melissa, there is only one communication taking place at the end of the simulation (when all snapshots are available).
The GPU is idle most of the time since one simulation takes more than a minute to run (on CPU).
To reproduce the paper's experiment (500 epochs with 1000 simulations) with distinct data at every epoch, we would need to run 500 000 simulations (there is a hard limit of 40 000 on the number of steps executable by one job on JZ).
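
The orders of magnitude behind the last point can be checked with a few lines. This is a rough sketch under two assumptions that are not stated above: each simulation consumes exactly one step of the job's step budget, and each simulation takes exactly 60 seconds (the description only says "more than a minute", so the CPU time is a lower bound).

```python
import math

epochs = 500                  # epochs in the paper's experiment
sims_per_epoch = 1_000        # simulations seen per epoch
max_steps_per_job = 40_000    # hard limit quoted for one job on JZ
seconds_per_sim = 60          # assumed lower bound on one simulation

# Fresh data at every epoch means one new simulation per sample per epoch.
total_sims = epochs * sims_per_epoch

# Assumption: one simulation == one step of the job's budget.
min_jobs = math.ceil(total_sims / max_steps_per_job)

# Lower bound on the cumulated CPU time, in hours.
min_cpu_hours = total_sims * seconds_per_sim / 3600

print(total_sims, min_jobs, round(min_cpu_hours))
```

Under these assumptions the experiment would have to be spread over at least 13 jobs.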

Notes about the model:

FNO2d is recurrent (i.e. it predicts one time step at a time) whereas FNO3d is direct (i.e. it can predict the next 40 snapshots in one shot). However, the FNO3d model normalizes its data, which is inconvenient for our framework, so FNO2d was preferred.
The NCCL back-end does not support complexFloat so torch's DDP is not an option for this use case.
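
To make the "recurrent" wording concrete, here is a minimal sketch of an autoregressive rollout: the model predicts one snapshot from a window of previous ones, and the prediction is fed back as input for the next step. The sliding-window scheme, the `rollout` function and `toy_model` are assumptions made for illustration; scalars stand in for vorticity fields and the toy model is not an FNO.

```python
from collections import deque


def rollout(model, first_snapshots, n_future):
    """Predict n_future snapshots one at a time, feeding predictions back in."""
    # The window always holds the most recent len(first_snapshots) snapshots.
    window = deque(first_snapshots, maxlen=len(first_snapshots))
    predictions = []
    for _ in range(n_future):
        next_snapshot = model(list(window))
        predictions.append(next_snapshot)
        window.append(next_snapshot)  # the oldest snapshot is dropped automatically
    return predictions


def toy_model(window):
    """Hypothetical stand-in for the network: last value plus one."""
    return window[-1] + 1


predictions = rollout(toy_model, list(range(10)), n_future=10)
print(predictions)
```

A direct model such as FNO3d would instead return all future snapshots from a single call, without the feedback loop.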

Edited Dec 20, 2022 by SCHOULER Marc
