Experiments Methodology
This repository is the place to share experience, feedback, references, and basically
any thought about experiment methodology.

Reproducibility
To achieve (some kind of) reproducibility, it is important to understand
all of its aspects. A good starting point is the
series of webinars on Reproducible
Research organized by Arnaud
Legrand.

Experiments environment
Every part of the experiment workflow depends on tools that need to be
preserved for the experiment to be reproducible.

System image
In the team we have developed a tool called
Kameleon to generate complete operating system
(OS) images that can be used as a base for experiment environments.
The built images can be deployed on multiple nodes of a cluster using
Grid'5000's
ability to let users deploy OS images.

Packaging
There are several ways to package an application and, more generally, a
software stack or software environment that contains one or several
applications with all the needed dependencies.

Containers
TODO: Explain Docker, rkt, LXC, ...
TODO: Explain lightweight HPC containers (Singularity, Charliecloud, ...)

Package manager
Classical Linux distribution package managers provide no guarantees on
reproducibility: depending on the mirror the packages are fetched from, the
date of installation, and the last package upgrade, the same installation
command will not install the same software. Debian provides a system of
snapshots that makes it possible to get the exact same
version of a piece of software if you know its installation date. This is
not sufficient to reproduce most software stacks, which are mainly custom
in the scientific community and therefore not available on the
distribution's mirrors. We use these snapshots in Kameleon recipes to
provide reproducible base OS images.
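As a rough illustration of how a snapshot pins a package source to a date, the sketch below builds an APT source line that points at the Debian snapshot archive. The helper name `snapshot_sources_line`, the chosen date, and the `stretch` suite are assumptions made for this example; they are not taken from the Kameleon recipes.

```python
from datetime import datetime, timezone

# Public Debian snapshot archive; the timestamp is appended as a path element.
SNAPSHOT_BASE = "https://snapshot.debian.org/archive/debian"


def snapshot_sources_line(when, suite, components=("main",)):
    """Return an APT source line pinned to a Debian snapshot timestamp.

    Hypothetical helper for illustration: `when` must be a timezone-aware
    datetime, `suite` a Debian release name such as "stretch".
    """
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    # check-valid-until=no is needed because the Release files of old
    # snapshots have expired by the time they are used.
    return (
        f"deb [check-valid-until=no] {SNAPSHOT_BASE}/{stamp}/ "
        f"{suite} {' '.join(components)}"
    )


if __name__ == "__main__":
    pinned = datetime(2018, 1, 15, tzinfo=timezone.utc)
    print(snapshot_sources_line(pinned, "stretch"))
```

Writing such a line into the APT sources of an image means that a later rebuild resolves packages against the archive state of that date rather than against whatever a mirror serves today.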
There are more specific kinds of package managers dedicated to
reproducibility.
Nix is a so-called functional package manager that builds packages in an
isolated environment. Those packages are written in the Nix expression
language, a domain-specific language (DSL). Nix has no side effects,
meaning that it stores everything in the Nix store, /nix/store on your
system, and at installation only provides symbolic links to access what's
inside the store. It has a lot of good properties, such as:

atomic upgrades and rollbacks
side-by-side installation of multiple versions of a package
user level package installation
multi-user package sharing
easy setup of build environments
binary and source packages
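Several of these properties follow from the way store paths are named after a hash of the package's inputs. The sketch below is a toy model only: `toy_store_path` is a hypothetical function, and real Nix hashes the complete derivation rather than this simplified string. It is only meant to show why two versions of a package never overwrite each other in the store.

```python
import hashlib
from typing import Iterable


def toy_store_path(name: str, version: str, inputs: Iterable[str]) -> str:
    """Toy model of a store path; NOT the real Nix hashing scheme."""
    digest = hashlib.sha256()
    for item in [name, version, *sorted(inputs)]:
        digest.update(item.encode("utf-8"))
        digest.update(b"\0")  # separator so ("ab", "c") differs from ("a", "bc")
    # Same inputs always give the same path; any change gives a new one.
    return f"/nix/store/{digest.hexdigest()[:32]}-{name}-{version}"


if __name__ == "__main__":
    # Package names and versions are arbitrary examples.
    print(toy_store_path("hello", "2.10", ["gcc-7.3.0"]))
    print(toy_store_path("hello", "2.12", ["gcc-7.3.0"]))
```

Because each variant lives under its own path, an upgrade or rollback amounts to repointing a symbolic link, which is what makes it atomic.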

Here is a general introductory presentation of Nix and its ecosystem:

We also use Nix to create reproducible packaging
for the tools that are developed in the team. A repository is
available here.
Nix can be used to generate virtual environments to run experiments in.
For example, it is possible to create a Nix profile that contains a complete
experiment environment, pack it with all of its dependencies (its closure,
in Nix parlance) into a tarball, and install it on each required node before
running the experiment. Here is an example of this kind of experiment packaging:
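Separately from that example, the pack-and-install steps can be sketched as two shell commands built from Python. This sketch assumes the `nix-store --export` / `nix-store --import` archive format instead of a plain tarball, and the function names and paths are hypothetical; it only builds the command strings and does not run them.

```python
import shlex


def pack_closure_command(profile: str, archive: str) -> str:
    """Shell command that exports a profile and all of its dependencies."""
    # `nix-store -qR` lists the closure of the path;
    # `nix-store --export` serializes those store paths to stdout.
    return (
        f"nix-store --export $(nix-store -qR {shlex.quote(profile)}) "
        f"> {shlex.quote(archive)}"
    )


def install_closure_command(archive: str) -> str:
    """Shell command to run on each node to load the archive into its store."""
    return f"nix-store --import < {shlex.quote(archive)}"


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(pack_closure_command("/tmp/experiment-profile", "experiment.closure"))
    print(install_closure_command("experiment.closure"))
```

The first command would run on the machine where the profile was built, and the second on every node after the archive has been copied there.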

Some HPC centers provide Nix to their users. The CIMENT computation
center provides it on the Froggy cluster. Here is
documentation on
how to use Nix on the platform.
There is also a reimplementation of Nix using the Guile language, called
GNU Guix. You can find more details
here. It is based on the Nix
implementation and keeps the Nix ARchive format (NAR) for built packages.
The main differences with Nix are:

a rewrite of the CLI, which is more user friendly and provides
more features (like Security
Updates)
a switch from the Nix DSL to an embedded DSL in the Guile language. Language
features like modules remove a lot of the boilerplate needed in Nix and make
the package definitions more readable. Yet, the Guile language is not
widely used.
the number of contributors, and therefore of packages, in Guix is much
smaller.

There is a Guix HPC initiative
that involves 3 HPC centers in Europe.
Another solution, called Spack, was made by the HPC
community to provide the ability to build an application with several
possible options: you can, for example, choose the compiler, the MPI library,
the specific features you want to enable or disable, etc.
Packages are written in plain Python using the Spack libraries.
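These build options are expressed on the command line as a spec string, where `@` selects a version, `%` a compiler, `+`/`~` enable or disable a variant, and `^` constrains a dependency. The helper below is a hypothetical sketch that assembles such a string; it is not part of Spack, and the package, compiler, and variant names are illustrative.

```python
def spack_spec(name, version=None, compiler=None, variants=None, deps=()):
    """Assemble a Spack-style spec string from build options.

    Hypothetical helper: `variants` maps a variant name to True (enabled)
    or False (disabled); `deps` is a list of dependency specs.
    """
    parts = [name + (f"@{version}" if version else "")]
    if compiler:
        parts.append(f"%{compiler}")
    for variant, enabled in (variants or {}).items():
        parts.append(("+" if enabled else "~") + variant)
    parts.extend(f"^{dep}" for dep in deps)
    return " ".join(parts)


if __name__ == "__main__":
    spec = spack_spec(
        "hdf5",
        version="1.10.1",
        compiler="gcc@7.3.0",
        variants={"mpi": True, "szip": False},
        deps=["openmpi@3.0.0"],
    )
    print(f"spack install {spec}")
```

Recording the full spec string alongside the experiment results makes it possible to rebuild the same configuration later.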
In the same spirit, there is
EasyBuild, but it is
backward compatible with the venerable Environment
Modules, a tool that provides dynamic
modification of a user's environment in the shell.
A newer implementation of modules, written in Lua and called
Lmod, can be used instead of Environment Modules; see this
paper.
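What "dynamic modification of a user's environment" means can be shown with a toy model: loading a module prepends directories to PATH-like variables, and unloading removes them again. The functions `load` and `unload` below are hypothetical and do not reproduce the Environment Modules or Lmod implementation; the directory names are examples.

```python
SEP = ":"  # separator of PATH-like variables in POSIX shells


def load(env, variable, directory):
    """Return a copy of env with directory prepended to a PATH-like variable."""
    updated = dict(env)
    entries = [e for e in updated.get(variable, "").split(SEP) if e]
    updated[variable] = SEP.join([directory] + entries)
    return updated


def unload(env, variable, directory):
    """Return a copy of env with directory removed from a PATH-like variable."""
    updated = dict(env)
    entries = [
        e for e in updated.get(variable, "").split(SEP) if e and e != directory
    ]
    updated[variable] = SEP.join(entries)
    return updated


if __name__ == "__main__":
    env = {"PATH": "/usr/bin:/bin"}
    loaded = load(env, "PATH", "/opt/openmpi/bin")
    print(loaded["PATH"])
    print(unload(loaded, "PATH", "/opt/openmpi/bin")["PATH"])
```

Because the environment depends on which modules were loaded and in which order, the list of loaded modules is worth recording with each experiment run.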

Workflow

Tools
List of tools that can be used to create your own experimental workflow

Convention
A paper about workflow automation defining the Popper
convention

Experiment narration
To keep track of what you've done, it is important to have a narration
coupled with the experiment workflow. Notebooks can contain this narration
and even the workflow itself thanks to literate
programming.
OrgMode is a very powerful tool to expose the workflow
and the code, in any language, that executes it.

Examples

Bebida
The Bebida repository contains a
set of experiments that run on Grid'5000 using the Popper convention (but
not the Popper CLI tools). It also contains the Kameleon recipes used to
build the base images deployed on the experiment nodes on Grid'5000.
Multiple experiments are implemented in Python using
Execo to automate resource reservation, image
deployment, configuration, and launch of the experiments.
Some Nix scripts are provided to generate virtual environments for data
analysis using Jupyter notebooks with particular dependencies.