
Melissa Sensitivity Analysis

About

Melissa is a file-avoiding, fault-tolerant, and elastic framework for running large-scale sensitivity analyses on supercomputers. The largest runs so far involved up to 30k cores, executed 80,000 parallel simulations, and generated 288 TB of intermediate data that did not need to be stored on the file system.

Classical sensitivity analysis consists in running different instances of the simulation with different sets of input parameters, storing the results to disk, and later reading them back to compute the required statistics. The amount of storage needed can quickly become overwhelming, and the associated long read times make computing the statistics time consuming. To avoid this pitfall, scientists reduce the size of their studies by running lower-resolution simulations or by down-sampling the output data in space and time.

Melissa bypasses this limitation by avoiding intermediate file storage. Melissa processes the data online (in transit), enabling very large scale sensitivity analyses. Melissa is built around two key concepts: iterative (sometimes also called incremental) statistics algorithms and an asynchronous client/server model for data transfer. Simulation outputs are never stored on disk. They are sent by the simulations to a parallel server, which aggregates them into the statistics fields in an iterative fashion and then throws them away. This makes it possible to compute statistics maps on every mesh element for every time step of a full-scale study.

Melissa comes with iterative algorithms for computing the average, variance and covariance, skewness, kurtosis, minimum, maximum, threshold exceedance, quantiles, and Sobol' indices, and it can easily be extended with new algorithms.
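To illustrate the idea behind iterative statistics, the sketch below shows a generic Welford-style update of a running mean and variance, one simulation result at a time, so no result ever needs to be kept. This is only an illustration of the concept, not Melissa's actual implementation.

import numpy as np

def update(count, mean, m2, sample):
    # Fold one new result (one value per mesh element) into the running
    # statistics; the sample can be discarded immediately afterwards.
    count += 1
    delta = sample - mean
    mean = mean + delta / count
    m2 = m2 + delta * (sample - mean)
    return count, mean, m2

count, mean, m2 = 0, np.zeros(5), np.zeros(5)
for _ in range(3):                       # three hypothetical simulation results
    sample = np.random.rand(5)           # stand-in for one runner's output field
    count, mean, m2 = update(count, mean, m2, sample)
variance = m2 / (count - 1)              # unbiased variance per mesh element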

The Melissa architecture relies on three interacting components:

  • Melissa runner (client): the parallel numerical simulation code turned into a client. Each client sends its output to the server as soon as it is available. Clients are independent jobs.
  • Melissa server: a parallel executable in charge of computing the statistics. The server updates the statistics upon reception of new data from any of the connected clients.
  • Melissa launcher: the front-end Python script in charge of orchestrating the execution of the global sensitivity analysis. This is the user's entry point for configuring the sensitivity analysis study.

To run a sensitivity analysis with Melissa, a user needs to:

  • Instrument the simulation code with the Melissa API (three base calls: init, send, and finalize) so it can become a Melissa runner.
  • Configure the sensitivity analysis (how to draw the parameters for each simulation run and which statistics to compute).
  • Start the Melissa launcher on the front end of the supercomputer. Melissa takes care of requesting resources to execute the server and runners, monitoring the execution, and restarting failing components when necessary.

As of now, Melissa only provides APIs for solvers developed in C, Fortran, and Python, but it can easily be extended to other languages (see the API folder).

For more details on the Melissa model for sensitivity analysis, refer to the PDF.

News

  • February 2021 Update (Version 0.7)
    • Melissa now additionally works on CentOS 7, CentOS 8, and Alpine Linux
    • Renamed the CMake variable ZeroMQ_DIR to ZeroMQ_ROOT; starting with CMake 3.12, CMake transparently uses such variables when searching for packages, and before that many programmers were already using this naming scheme in their CMake setups
    • Fixed the CMake target import
    • Made import of launcher as Python3 package easier
    • Reduced the amount of user-provided code in options.py needed to run simulations
    • Updated Slurm support for clusters without mpirun (cf. Spack package manager issue #10340)
    • Updated the Code_Saturne example for Code_Saturne 6
    • Improved code quality (e.g., more tests, fixed compiler warnings)
    • Removed deprecated code including Python2 code, obsolete Python modules, and old C functions
  • Jan 2020: GitHub continuous update
    • Sync our work repo with the GitHub repo so all changes are immediately visible to all
    • Major code restructuring and documentation update
    • New tools for supporting a virtual cluster mode and using Jupyter notebook for controlling a Melissa run
  • Nov 2018: Melissa release 0.5 available on GitHub
    • Changes in the API: remove arguments from the melissa_send function
    • Add batch mode
    • Improve launcher fault tolerance
    • Improve the examples and the install tree
    • Many fixes
  • Nov 2017: Melissa release 0.4 available on GitHub
    • Improve quantiles and threshold exceedances
    • Add iterative skewness and kurtosis
    • Add a restart mechanism over results of previous study
    • Add FlowVR coupling mechanism for Sobol' groups
    • Add Telemac2D example, including FlowVR coupling mechanism
    • Many bug fixes

Melissa Documentation

The Melissa documentation comes with the code and is spread across README.md files placed as close as possible to the code they describe. You are currently reading the root README.md of this documentation.

Getting Started

Install Melissa

Melissa can be installed with the Spack package manager or it can be built manually. We advise users to use the Spack package manager by default because it takes care of dependencies, installation paths, and environment variable updates. The Melissa Spack build is actively maintained by the Melissa developers. Spack does not require superuser rights.

Installation with Spack

Base Spack Install
  1. Install and set up Spack: see the Spack documentation, Getting Started
  2. Build Melissa:
  • Latest stable version:
spack install melissa
  • Latest development version:
spack install melissa@develop
  3. Load Melissa:
spack load melissa
  4. Check your Melissa version:
melissa-config --version

You now have a working Melissa installation.

Spack and modules

If your supercomputer works with environment modules, you may want to build Melissa against one of these modules, typically the MPI module optimized for your machine. Here is a basic example of how to proceed:

  • Load the module:

    $ module load openmpi/4.0.5
    
    Loading openmpi/4.0.5
    Loading requirement: intel-compilers/19.1.3
  • Tell Spack to search for installed (so-called "external") packages:

    $ spack external find
    
     ==> The following specs have been detected on this system and added to /home/user/.spack/packages.yaml
     openmpi@4.0.5
  • Afterwards the modules can be unloaded:

    module purge
  • The module may have been compiled with a specific compiler. Add the compiler to Spack:

     $ spack compiler find
    
     ==> Added 1 new compiler to /home/user/.spack/linux/compilers.yaml
      intel@19.1.3.304
     ==> Compilers are defined in the following files:
      /home/user/.spack/linux/compilers.yaml
  • Spack can only recognize a small number of packages automatically, and in many cases it is advisable or even necessary to edit the list of available packages manually. The list of packages can be found in ~/.spack/packages.yaml. If you perform these modifications, you must find out

    • the path to the package or the relevant module(s) to load,

    • the Spack package name (e.g., the Spack package name of CMake is cmake),

    • the package version,

    • and the compiler that was used.

    The compiler dependency is very important because in the author's experience different compiler modules may depend on different versions of essential libraries such as libc or binutils. Using a package that was built with one compiler while a different compiler module is loaded may lead to crashes in the best case and silent data corruption or a waste of supercomputer CPU hours in the worst case.

    An example with Intel MPI:

    packages:
      gmake:
        externals:
        - spec: gmake@4.2.1
          prefix: /usr
      intel-mpi:
        externals:
        - spec: intel-mpi@2019.5.281%intel@19.0.5.281
          modules: [intel-all/2019.5]
  • In some environments existing packages must be used; they should never be built by Spack. For example, on supercomputers MPI must often be linked against the libraries of the computer's batch scheduler, because otherwise MPI applications cannot be launched. You can prohibit Spack from building such packages from scratch. Example:

    packages:
      mpi:
        buildable: false
  • Build Melissa with specific compiler and package dependencies:

     spack install melissa %intel@19.1.3.304 ^openmpi@4.0.5

Manual Installation

Melissa can be downloaded from the Melissa repository on Inria GitLab or with the following command:

git clone https://gitlab.inria.fr/melissa/melissa.git

which will create a folder called "melissa".

The following dependencies must be installed before building Melissa:

  • CMake 3.7.2 or newer
  • GNU Make
  • A C99 compiler
  • An MPI implementation
  • Python 3.5.3 or newer
  • ZeroMQ 4.1.5 or newer

On Debian-based systems, these dependencies can be installed via:

sudo apt-get install cmake build-essential libopenmpi-dev python3.8 libzmq3-dev

The Melissa header for Fortran90 will be installed even if no Fortran compiler is present.

Next, create build and install directories and change to build:

mkdir build install
cd build

Call CMake and customize the build by passing build options on its command line (see the table below). If you are unsure whether all dependencies are installed, simply run CMake: it will locate all required software packages automatically, check their version numbers, and print error messages otherwise. The build below has compiler optimizations enabled:

cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=../install ../melissa
make
make install

Users installing on a local machine may benefit from adding the -DINSTALL_ZMQ=ON flag, whereas cluster users should ensure that the cluster-installed ZeroMQ is used. Additionally, users can speed up the make step by passing the -j flag followed by the number of available cores on the compilation machine (e.g., make -j6 for a machine with 6 cores).

Update environment variables to ensure the Melissa launcher and server can be found by the shell:

source ../install/bin/melissa-setup-env.sh

This command needs to be executed whenever you start a new shell.

Build Options
CMake option | Default value | Possible values | Description
-DCMAKE_BUILD_TYPE | -- | Release, RelWithDebInfo, MinSizeRel, Debug | Select development or release builds
-DCMAKE_INSTALL_PREFIX | ../install | any location on the filesystem | Melissa installation directory
-DZeroMQ_ROOT | -- | path to the ZeroMQ installation directory | Only needed for nonstandard ZeroMQ installation paths
-DINSTALL_ZMQ | OFF | ON, OFF | Download, build, and install ZeroMQ
-DBUILD_DOCUMENTATION | OFF | ON, OFF | Build the documentation (requires Doxygen)
-DBUILD_TESTING | ON | ON, OFF | Build tests; run with make test in the build directory

Run a first example

Go to the heat example README.md to run your first sensitivity analysis with Melissa.

Sensitivity Analysis with Melissa

A sensitivity analysis with Melissa requires:

  • a simulation code augmented with Melissa function calls,
  • a configuration file with the desired statistics,
  • a way to launch MPI jobs, and
  • a call to the Melissa launcher.

Before going further, the reader is strongly advised to take a look at the heat-pde tutorial.

Augmenting the Simulation Code

From the point of view of Melissa, a simulation manages the state of one or more fields or quantities (e.g., energy, temperature, or pressure). Each field or quantity can have its own mesh but these meshes must be fixed. For each kind of value to be analyzed by Melissa, the simulation must call

#include <melissa/api.h>
melissa_init("value-kind", grid_size, mpi_communicator);

The MPI communicator must be application-specific (see the MPI_Comm_get_attr documentation regarding the property MPI_APPNUM). For every time step and for every kind of value, the simulation must call

#include <melissa/api.h>
const double* values = ...;
melissa_send("value-kind", values);

Keep in mind that one time step for Melissa does not have to equal one time step in the simulation. After all data has been sent, the simulation must call

#include <melissa/api.h>
melissa_finalize();

This call is mandatory and must take place before MPI_Finalize() is called.

Melissa Options File

For a sensitivity analysis, the user must decide

  • which statistics Melissa shall compute,
  • which input variables shall be part of the sensitivity analysis, and
  • which output variables shall be part of the sensitivity analysis.

This information must be passed to Melissa in a Python file commonly called options.py. It is suggested to start from one of the options files given in the example folder:

  • For every desired statistic, set the right-hand side to True. For all other statistics, set them to False.
  • For every input variable part of the sensitivity analysis, generate a value in the function draw_param_set(). The Python random or the NumPy random module may be helpful.
  • For every output variable that is part of the sensitivity analysis, add its name to the list STUDY_OPTIONS['field_names'].

The function draw_param_set() must return an array of double-precision floating-point values, e.g., a Python list of floats or a NumPy array. The function will be invoked repeatedly and every invocation must return the same number of values.
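For illustration, a minimal options.py fragment along these lines might look as shown below. Only STUDY_OPTIONS['field_names'] and draw_param_set() are taken from the description above; the statistics switches and the other keys are hypothetical placeholders, so copy the exact names from one of the example options files.

import random

STUDY_OPTIONS = {}
STUDY_OPTIONS['field_names'] = ['temperature']   # output variables to analyze
NB_PARAMETERS = 2                                # hypothetical: inputs per simulation

# Hypothetical statistics switches: set the desired ones to True.
MELISSA_STATS = {'mean': True, 'variance': True, 'sobol_indices': False}

def draw_param_set():
    # Must return the same number of double-precision values on every call.
    return [random.uniform(0.1, 10.0) for _ in range(NB_PARAMETERS)]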

MPI Job Launcher

Melissa knows several ways to launch MPI jobs. Local jobs can be started with OpenMPI; on a cluster, the batch schedulers Slurm and OAR can be employed. Before running the sensitivity analysis, the user must decide on one of the available methods.

Starting the Sensitivity Analysis

The sensitivity analysis can be launched with the aid of the Melissa launcher. First, ensure the Melissa launcher can be executed by updating the environment variables. Locate the Melissa installation directory and execute

source bin/melissa-setup-env.sh

The Melissa command line has the following structure:

melissa-launcher openmpi options.py simulation

The first argument (here: openmpi) selects the way MPI jobs are started, the second argument is the path to the Melissa options file, and the third argument is the name of the simulation executable. Paragraphs below discuss how Melissa locates simulation executables and how the simulation is called by Melissa.

Executables should be located on the PATH environment variable; otherwise, the user must pass the full path to the executable. Users can test whether the executable is on the PATH by typing the simulation name into a terminal, e.g., code-sol --help, where code-sol is the name of the simulation executable. If this does not work, the user can either provide the absolute path to the executable on the launcher command line or update their PATH with export PATH="/path/to/code-sol-exe:$PATH". Keep in mind that this environment variable is not persistent unless the command is added to the .bashrc file once, which can be done with:

echo 'export PATH=/path/to/code-sol-exe:$PATH' >> ~/.bashrc 

Melissa sets up simulations based on the file options.py which is passed as an argument on the command line. For the sensitivity analysis, Melissa must run the simulation with the input values generated by the user-provided function draw_param_set() (see the description of the Melissa options file above). Melissa will pass the input values on the simulation command line. For example, if there are two input values 3.14 and 159.0, then Melissa will run simulation 3.14 159.0. If the user needs to customize the simulation launch, then they can replace the actual simulation with a script that calls the simulation. Such a script will be called the simulation script for the purposes of this README. For example:

#!/bin/sh

set -eu
exec code-sol --input "$1" --input "$2"

The two requirements for such a script are that

  • it accepts the simulation input values on the command line and
  • it is executable, i.e., the executable bits are set.

Pay attention to the fact that every MPI job will execute the script! On a computer cluster, the different script executions may take place on different computers.

In some cases the simulation needs a special setup. For example, certain directories may need to be present or the input values must be written to files. The Melissa launcher can execute the simulation script in two steps if the argument --with-simulation-setup is passed on the command line: melissa-launcher --with-simulation-setup openmpi options.py ./code-sol.sh. The simulation script is then called twice:

  • In the first step, the simulation script is called on the user's computer once and the additional argument initialize will be passed on the command line in addition to the simulation's input values (e.g., ./code-sol.sh initialize 3.14 159.0).
  • After the simulation script has exited successfully, the launcher starts the MPI jobs. Every MPI job invokes the simulation script with the argument execute and the input parameters (e.g., ./code-sol.sh execute 3.14 159.0).
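For illustration only, a Python variant of such a two-step simulation script might look as follows (code-sol and its --input flags are hypothetical stand-ins for the real solver):

#!/usr/bin/env python3
# Hypothetical two-step simulation script: Melissa first calls it locally with
# "initialize", then each MPI job calls it with "execute".
import os
import subprocess
import sys

step, params = sys.argv[1], sys.argv[2:]

if step == "initialize":
    # One-time local setup: create a work directory and store the input values.
    os.makedirs("work", exist_ok=True)
    with open("work/inputs.txt", "w") as handle:
        handle.write(" ".join(params))
elif step == "execute":
    # Launch the real solver with the same input values.
    sys.exit(subprocess.call(["code-sol", "--input", params[0], "--input", params[1]]))

As with the shell version, the script must accept the input values on the command line and be executable (its executable bits must be set).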

Command Reference

Launcher

The launcher command line looks as follows:

melissa-launcher <scheduler> <options> <simulation>

Scheduler is one of the available schedulers (call melissa-launcher --help for an up-to-date list), options is a path to a Python file with the Melissa options, and simulation is the name or the complete path to a simulation executable.

--help

The launcher shows an overview of all command line options and all available schedulers, then exits.

--version

The launcher shows the Melissa version and exits.

--output-dir <directory-template>

A sensitivity analysis may generate a large number of files and for this reason, Melissa creates a new subdirectory of the current working directory for all of its outputs. By default, the directory name is the local time and date in ISO 8601 basic format, e.g., on December 7, 1999, at 12:34:56pm local time, the ISO 8601 timestamp is 19991207T123456.

The --output-dir option allows users to change the directory name. Furthermore, the argument value is passed to the standard C function strftime, so its conversion specifiers (such as %T) are expanded.

Examples:

  • --output-dir='..' makes Melissa write to the parent directory.
  • --output-dir='project-alpha' makes Melissa write to a directory called project-alpha.
  • --output-dir='project-beta-%T': output directory project-beta-19:12:00 (local time 7:12pm).

CAUTION:

  • Existing content in the output directory will be overwritten and/or erased.
  • Clock changes (e.g., from winter to summer time) may cause Melissa to generate the same output directory name twice.
  • Leap seconds may cause timestamps with "60" as the number of seconds (instead of only 00 to 59).
  • The single quotes in the examples above stop POSIX-compliant shells from modifying the argument value.

--scheduler-arg SCHEDULER_ARG

This option allows the user to pass arguments directly to the batch scheduler. This can be used, e.g., on a supercomputer to pass accounting information or select queues. For example, with the Slurm batch scheduler, the account can be selected as follows:

srun --account='melissa-devs' --ntasks=1 code-sol 3.14 159.0

Continuing the example, Melissa can be made to use this account as follows:

melissa-launcher --scheduler-arg=--account=melissa-devs slurm options.py code-sol

CAUTION: The Melissa command line parser cannot handle spaces in the argument value, e.g.,

  • --scheduler-arg '--account=melissa-devs' works whereas
  • --scheduler-arg='-A melissa-devs' does not work.

CAUTION: The user should not modify the number of processes (or tasks) using --scheduler-arg; instead, they should use --num-client-processes and --num-server-processes. Melissa needs to track the number of server and client processes to make proper resource allocation requests, but since the launcher does not examine scheduler arguments, it cannot incorporate this data into its requests.

--scheduler-arg-client SCHEDULER_ARG_CLIENT

This option is identical to --scheduler-arg except that the argument values are only passed to the batch scheduler when launching clients.

--scheduler-arg-server SCHEDULER_ARG_SERVER

This option is identical to --scheduler-arg except that the argument values are only passed to the batch scheduler when launching servers.

--num-client-processes NUM_CLIENT_PROCESSES

The number of MPI processes of clients.

--num-server-processes NUM_SERVER_PROCESSES

The number of MPI processes of servers.

--with-simulation-setup

This option makes the Melissa launcher run the simulation once without MPI on the local computer before actually starting the simulation via the batch scheduler. This can be useful, e.g., to set up directories or to modify simulation input files.

Without this option, every MPI job will start the simulation as follows (3.14 and 159.0 are the input values passed by the launcher to the simulation):

mpirun -n 10 -- simulation 3.14 159.0

With this option, the simulation will be run twice and an additional argument will be passed on the command line. The first run is local without MPI:

simulation initialize 3.14 159.0

Only the second run invokes MPI (and may take place on another computer when working on computer clusters):

mpirun -n 10 -- simulation execute 3.14 159.0

Melissa Client API

The Melissa Client API provides functions for sending simulation data to a Melissa server for statistical analysis. The client API header file is melissa/api.h. The Melissa client code can be found in a library called melissa; link with -lmelissa.

Augmenting a Simulation

To make a simulation send data to the Melissa server the user should:

  • identify the fields (or quantities) that will be analyzed by Melissa,
  • identify where the quantities are stored in the simulation memory,
  • call MPI_Init() if needed,
  • call melissa_init() once for every quantity to be analyzed,
  • call melissa_send() once for every quantity to be analyzed and for every time step, and
  • call melissa_finalize() before terminating the simulation and/or calling MPI_Finalize().

Users should pay attention to the following items:

  • The list of quantities to be analyzed must be kept up-to-date inside the Melissa options file options.py.
  • A Melissa time step does not necessarily equal one simulation time step.
  • The discretization of the field to be analyzed must be constant. Melissa does not support variably discretized fields.
  • If Sobol' indices are to be computed, and if the MPI coupling shall be used, then the MPI communicator passed to Melissa must be simulation-specific, i.e., the communicator returned by MPI_Comm_get_attr(MPI_COMM_WORLD, MPI_APPNUM, ...).
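Regarding the last point, the sketch below shows the standard way to obtain a simulation-specific communicator from the MPI_APPNUM attribute. It is written with mpi4py purely for illustration (assuming mpi4py is available); the C code follows the same MPI_Comm_get_attr / MPI_Comm_split pattern.

from mpi4py import MPI

# MPI_APPNUM identifies this application within an MPMD launch; splitting
# MPI_COMM_WORLD by it yields a communicator private to this simulation.
appnum = MPI.COMM_WORLD.Get_attr(MPI.APPNUM)
color = appnum if appnum is not None else 0
app_comm = MPI.COMM_WORLD.Split(color, MPI.COMM_WORLD.Get_rank())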

Client API

The most interesting headers for Melissa users are melissa/api.h and melissa/config.h. The latter file defines preprocessor macros for the Melissa version and the enabled features.

Troubleshooting

CMake does not find MPI with Intel Compilers 19

Error message:

-- Could NOT find MPI_C (missing: MPI_C_WORKS)

Solution: Make CMake invoke the Intel Compiler instead of GCC.

env CC=icc CXX=icpc cmake -- /path/to/melissa

License

Melissa is open source under the BSD 3-Clause License.

How to cite Melissa

Melissa: Large Scale In Transit Sensitivity Analysis Avoiding Intermediate Files. Théophile Terraz, Alejandro Ribes, Yvan Fournier, Bertrand Iooss, Bruno Raffin. The International Conference for High Performance Computing, Networking, Storage and Analysis (Supercomputing), Nov 2017, Denver, United States. pp.1 - 14.

@inproceedings{terraz:hal-01607479,
  TITLE = {{Melissa: Large Scale In Transit Sensitivity Analysis Avoiding Intermediate Files}},
  AUTHOR = {Terraz, Th{\'e}ophile and Ribes, Alejandro and Fournier, Yvan and Iooss, Bertrand and Raffin, Bruno},
  URL = {https://hal.inria.fr/hal-01607479},
  BOOKTITLE = {{The International Conference for High Performance Computing, Networking, Storage and Analysis (Supercomputing)}},
  ADDRESS = {Denver, United States},
  PAGES = {1 - 14},
  YEAR = {2017},
  MONTH = Nov,
  KEYWORDS = {Sensitivity Analysis ; Multi-run Simulations ; Ensemble Simulation ; Sobol' Index ; In Transit Processing},
  PDF = {https://hal.inria.fr/hal-01607479/file/main-Sobol-SC-2017-HALVERSION.pdf},
  HAL_ID = {hal-01607479},
  HAL_VERSION = {v1},
}

Publications

  • Melissa: Large Scale In Transit Sensitivity Analysis Avoiding Intermediate Files. Théophile Terraz, Alejandro Ribes, Yvan Fournier, Bertrand Iooss, Bruno Raffin. The International Conference for High Performance Computing, Networking, Storage and Analysis (Supercomputing), Nov 2017, Denver, United States. pp.1 - 14. PDF
  • The Challenges of In Situ Analysis for Multiple Simulations. Alejandro Ribés, Bruno Raffin. ISAV 2020 – In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization, Nov 2020, Atlanta, United States. pp.1-6. (https://hal.inria.fr/hal-02968789)

Dependencies

Melissa would not exist without high-quality C compilers, Fortran compilers, Python interpreters, standard language libraries, build systems, development tools, text editors, command line tools, and Linux distributions. The Melissa developers want to thank all developers, maintainers, forum moderators and everybody else who helped to improve these pieces of software.

Melissa links against ØMQ (ZeroMQ), and because Melissa may be distributed in binary form (when static linking is enabled), we are obliged to mention that ZeroMQ is available under the terms of the GNU Lesser General Public License version 3 with static linking exception.

Copies of the licenses can be found in the folder licenses.

Development Hints

C and C++

C and C++ are easily susceptible to memory bugs and undefined behavior. For example, the following C99 code shows undefined behavior because the literal 1 is taken to be a signed integer by the compiler:

#include <stdint.h>
uint32_t u = 1 << 31;

This is the corrected code:

#include <stdint.h>
uint32_t u = UINT32_C(1) << 31;

Many of these errors can be detected at compile-time if warnings are enabled and at run-time with the aid of the address sanitizer (ASAN) and the undefined behavior sanitizer (UBSAN). Both sanitizers are supported by GCC and Clang.

Most warnings are already enabled in CMakeLists.txt.

To enable the sanitizers, pass -fsanitize=address for ASAN and -fsanitize=undefined for UBSAN on the compiler command line. When using CMake, export the following environment flags before calling CMake:

export CFLAGS='-fsanitize=address -fsanitize=undefined'
export CXXFLAGS='-fsanitize=address -fsanitize=undefined'

Afterwards, build the code as usual with make and make install. If an error is detected, the sanitizers print file and line information, provided debugging information is present in the executable files. This can be achieved either by adding the flag -g to the compiler options or by setting -DCMAKE_BUILD_TYPE=Debug on the CMake command line.

As of August 2020, some Melissa tests seem to be leaking memory. To ignore memory leaks (and have ASAN only check for memory corruption), set the following environment variable before running the tests:

env ASAN_OPTIONS='leak_check_at_exit=0' ctest

Lists of ASAN and UBSAN options are available on the respective project websites.

Additionally, the standard memory allocator on Linux can be instructed to perform extra consistency checks by setting some environment variables:

export MALLOC_CHECK_=3
export MALLOC_PERTURB_=1

This approach works with any application and without any code modification.

MPI

MPI code may lead to false positives when checking for leaks with Valgrind or the address sanitizer. The address sanitizer can be instructed not to check for memory leaks on exit (update the environment variable ASAN_OPTIONS='leak_check_at_exit=0'), and the Valgrind manual contains instructions for MPI applications (see §4.9, Debugging MPI Parallel Programs with Valgrind).

Open MPI is known to leak (usually small) amounts of statically allocated memory. For this reason, recent Open MPI releases ship with a Valgrind suppression file; see the Open MPI FAQ entry 13, "Is Open MPI 'Valgrind-clean' or how can I identify real errors?"

ZeroMQ

Building ZeroMQ causes linker errors when the GNU ld option -z defs is used.