IO-SETS: Simple and efficient approaches for I/O bandwidth management

This repository contains the source code, log files, and results of the paper entitled "IO-SETS: Simple and efficient approaches for I/O bandwidth management".

In this README, we show the steps to reproduce the results presented in our manuscript. This file is organized as follows:

  • How to Cite This Material
  • Practical tests overview
  • Getting Started
  • Configurations
  • Run simulations
  • Reproduce Simulation Results
  • Reproduce Simulation Results Using Docker
  • Extra Results and Graphs
  • Practical Experiments

We also provide a Dockerfile that can be used to quickly reproduce our simulation tests. If you want to reproduce the results using Docker, you can skip the installation and execution procedures and refer directly to Reproduce Simulation Results Using Docker.

This repository also contains extra results and graphs that are not presented in the submitted version of the article. In Extra Results and Graphs, we show how you can check out these results.

This software was partially supported by the EuroHPC-funded project ADMIRE (Project ID: 956748, https://www.admire-eurohpc.eu).

How to Cite This Material

If you find this material useful for your research or work, please consider citing it:

Francieli Boito, Guillaume Pallez, Luan Teylo, and Nicolas Vidal (2023). IO-SETS: Simple and efficient approaches for I/O bandwidth management. DOI: [10.5281/zenodo.8237993](https://doi.org/10.5281/zenodo.8237993)

Practical tests overview

Besides the simulation results, our manuscript also presents practical executions (Section VI-A). In Practical Experiments we provide some information about these executions.


You can find the submitted version of our paper in submitted.pdf and the web supplementary material in web_supplementary.pdf.

Getting Started

The code was developed on a Debian-like system (Ubuntu 20.04 LTS), and all instructions reported here are based on this system. Version numbers are provided as an indication of the versions that were tested and used in this project.

Getting the Dependencies

  • C++ compiler (g++ v.9.3.0)
  • CMake (v3.16.3)
  • Python 3 (v3.9)
  • boost (v1.48)
  • git

Recent versions of the dependencies can be installed as follows (in a shell):

apt install python3 python3-pip
apt install g++ 
apt install cmake
apt install libboost-dev libboost-context-dev
apt install git 

Installing the project

This project contains the following tools:

  • The simulation tool that implements the IO-Sets methodology, called simgio (src/ and include/)
  • A Python package that facilitates the deployment and execution of numerous simulation tests, called pysimgio (PYsimgio/)

Simgio runs on top of the simgrid framework (v3.30). For the tests presented in the manuscript, the simgrid commit used was 2618e575d819b0d0046c6c35a4ae96e94b61d0be.

First, install simgrid:

git clone https://framagit.org/simgrid/simgrid
cd simgrid/
git checkout 2618e575d819b0d0046c6c35a4ae96e94b61d0be
cmake -DCMAKE_INSTALL_PREFIX=/opt/simgrid .
make
make install
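
If simgio later fails to locate the SimGrid shared libraries at runtime, you may need to point the dynamic loader at the non-standard install prefix (a common fix when installing under /opt; whether it is needed depends on your system configuration):

export LD_LIBRARY_PATH=/opt/simgrid/lib:$LD_LIBRARY_PATH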

Second, install pysimgio:

cd ../ # return to the root folder of the repository
pip install python-dateutil --upgrade
pip install setuptools
pip install -e .
pip install -r PYsimgio/requirement.txt 
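
If you prefer to keep these Python packages isolated from your system installation, you can create and activate a virtual environment before running the pip commands above (standard Python practice, not a project requirement):

python3 -m venv venv
source venv/bin/activate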

Third, compile simgio:

mkdir build
cd build
cmake ../
make 

By default, the simgio executable is placed in bin/.
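
As a quick sanity check (assuming the build completed without errors), you can print simgio's own help menu; see the note about the two help menus in simgio Example below:

cd ../bin/    # from the build/ directory
./simgio --h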

Configurations

Before starting, it is necessary to configure the variable SIMGIO_HOME in PYsimgio/evaluation.py. This variable must be set to the actual root directory of the project.

That is the only parameter that needs to be defined.

class Evaluation:
    # List of special characters
    COMMENT_CHAR = '#'
    HOST = 'bob'
    FUNCTION = 'host'

    SIMGIO_HOME = '$HOME/iosets'  # <-- THIS VARIABLE MUST CONTAIN THE FULL PATH TO THE ROOT FOLDER OF THE PROJECT
    CSV_FOLDER = 'csv'
    XML_FOLDER = 'xml'
    GRAPH_FOLDER = 'graphs'
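
For example, assuming the repository root is your current working directory, the default value can be replaced in one step (a convenience sketch; editing the file by hand works just as well):

sed -i "s|\$HOME/iosets|$(pwd)|" PYsimgio/evaluation.py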

Run simulations

simgio Example

Here, we give an example of how to run a simulation with simgio.

Simgio receives as input two XML files:

  1. the platform file;
  2. the deployment file.

The platform file describes the computational environment, while the deployment file describes the workload. An example of these files with a complete description of each XML field is presented in example/deployment/simple_d.xml and example/platform/host_with_disk.xml.

A simulation with these files can be executed as follows:

$ cd bin/
$ ./simgio -p ../example/platform/host_with_disk.xml -d ../example/deployment/simple_d.xml -ts 1  -te 9  --log=simgio.thres:verbose

Note that the parameters -ts and -te (time frame begin and end) are mandatory. For more details about the time frame, see Section V-C in submitted.pdf.

The --log parameter is optional and controls the output verbosity of simgrid (more info in ./simgio --help-logs).

⚠️ There are two distinct help menus in simgio:

  • --help calls the simgrid help menu
  • --h calls the simgio help menu.

PYsimgio Example

Pysimgio was initially developed to facilitate the execution of our tests, but it evolved to become a fundamental part of simgio. It includes functions that allow us to generate a wide range of tests, handle simgio CSV outputs, and generate graphs.

Running Pysimgio tests

All tests are written and executed using the classes PYsimgio.app_generator and PYsimgio.evaluation.

  • PYsimgio.evaluation is responsible for generating the workloads and executing a sequence of tests considering, for example, different scheduling strategies. At the end of the execution, the class writes a CSV with the simulation output of each generated workload.

  • PYsimgio.app_generator is responsible for generating a job following the generation protocol described in Section V-B of submitted.pdf.

To illustrate pysimgio usage, consider the example in example/pysimgio/simulation.py, where the PYsimgio.evaluation class is used to execute simulations with 20 distinct workloads composed of 60 jobs each. At each execution, a workload with a different job distribution is created by calling the tasks_per_mu function. Then, set_mapping is used to create the IO-Sets. Finally, the f_fair_share and f_set_10 functions are used to define the set priorities (in the case of f_fair_share, the priority is defined per job).

cd example/pysimgio/
python3 run_tests.py 

At the end of the execution the following folders are created:

  • xml/ : to store all generated XML deployment files
  • csv/: to store all output CSV files

A Jupyter notebook presenting the results of this evaluation can be found in example/pysimgio/results.ipynb. This notebook also illustrates the use of some functions available in PYsimgio.simgio_utils.py.

cd example/pysimgio/
jupyter notebook

Reproduce Simulation Results

You can use the following commands to reproduce the simulations and the results reported in Section VI of our manuscript:

cd simulation_results/
sh run_all.sh

After the execution finishes, you can reproduce figures 3 to 9 of the manuscript as follows:

cd simulation_results/plot
python3 plot00.py # Figure 4
python3 plot01.py ../heavy_io/60_apps/simulation_01/ 60 # Figures 5 and 6
python3 plot02.py ../heavy_io/60_apps/simulation_02/ 60 # Figure 7
python3 plot03.py ../heavy_io/60_apps/simulation_03/ 60 # Figure 8
python3 plot04.py ../heavy_io/60_apps/simulation_04/ 60 # Figure 9

⚠️ The computational platform used to execute the simulations is not relevant for reproducing the results reported in the manuscript; it only affects the time to run the simulations, which is not the focus of our study.

Reproduce Simulation Results Using Docker

In the root directory of the project, we provide a Dockerfile that creates a container with all the tools needed to reproduce the simulations of our manuscript.

For the next steps, we assume that you already have Docker installed on your machine. If that is not the case, you can refer to https://docs.docker.com/engine/install/ to install it.

Before creating the container, check that the variable SIMGIO_HOME in PYsimgio/evaluation.py is set to $HOME/iosets (the default value). The simulations will not run inside the container if SIMGIO_HOME points to another path.
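
You can verify this quickly from the repository root:

grep -n SIMGIO_HOME PYsimgio/evaluation.py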

First, in the root directory of the project, create a Docker image:

docker build -t iosets . 

Second, execute the container in interactive mode:

docker run -it iosets

Inside the container, you can use the following commands to reproduce all simulation results:

cd simulation_results/
sh run_all.sh 

The figures of the manuscript can also be generated inside the container:

cd simulation_results/plot/
python3 plot00.py # Figure 4
python3 plot01.py ../heavy_io/60_apps/simulation_01/ 60 # Figures 5 and 6
python3 plot02.py ../heavy_io/60_apps/simulation_02/ 60 # Figure 7
python3 plot03.py ../heavy_io/60_apps/simulation_03/ 60 # Figure 8
python3 plot04.py ../heavy_io/60_apps/simulation_04/ 60 # Figure 9
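
The container filesystem is discarded when the container exits, so you may want to copy the generated figures to the host. One way is docker cp from a second shell (a sketch, assuming the image keeps the default SIMGIO_HOME of $HOME/iosets and runs as root, so the project lives in /root/iosets; adjust the path if your image differs):

docker ps   # find the ID of the running iosets container
docker cp <container-id>:/root/iosets/simulation_results/plot ./plots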

Extra Results and Graphs

All folders with the simulation results contain a Jupyter notebook called results.ipynb with extra graphs.

Jupyter is included in PYsimgio/requirements.txt. Therefore, if you followed the installation steps, the jupyter package should already be installed on your machine. Otherwise, you can install it as follows:

pip install jupyter

You can then open Jupyter's web interface, navigate through the folders, and open the notebooks to check the extra graphs:

cd simulation_results/
jupyter notebook

Practical Experiments

In the realmachine_experiments/ folder you'll find all the code used to generate the practical results presented in Section VI-A, as well as the results themselves.

  • The ior_schedclient folder contains IOR 3.4.0 modified to talk to the scheduler using sockets (over TCP). It receives the IP of the server as an argument. This implementation only works for periodic write phases (with no reading). It has been compiled and executed with OpenMPI 4.0.3. The new command-line options are -p for the IP of the scheduler and -P for the witer. An estimate of witer must be provided because that information is used by the scheduler (it does not compute it automatically). An illustrative invocation is sketched after this list.

  • The scheduler folder contains the scheduler. At the beginning of the .c file, we can change the scheduling algorithm and the verbosity level (debug as true or false). The scheduler must be running before the IOR instances are started.

  • In the run_scripts folder, the run_with_scheduler and run_without_scheduler scripts were used inside Slurm scripts to run the experiments. They call run_concior_test.py to start the concurrent IOR instances. All parameters used to call IOR are configured at the beginning of the Python script.

  • In the parse_results folder, we can find the Python scripts used to calculate the reported metrics from the results (stretch, utilization, etc.).

  • The results folder contains all results, one test per folder. Inside each test folder, we can find the standard output of each IOR instance and of the scheduler. These outputs were omitted for the double-blind review because they mention the folder where the tests were executed, including the username.
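
To illustrate the modified options, here is a hypothetical invocation of the modified IOR (the -p and -P values, as well as the standard IOR options -w, -t, and -b, are illustrative assumptions; the actual parameters used in the experiments are configured in run_scripts/run_concior_test.py):

# hypothetical example -- the real parameters are set in run_scripts/run_concior_test.py
mpirun -np 4 ./ior -w -t 1m -b 16m -p 192.168.0.10 -P 30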