# IO-Sets: Simple and efficient approaches for I/O bandwidth management

This repository contains the source code, log files, and results of the paper entitled "IO-Sets: Simple and efficient approaches for I/O bandwidth management".
In this README we show the steps to reproduce the results presented in our manuscript. This file is organized as follows:

- Getting Started presents the installation steps and setup of the simulation tool used in this paper
- Run Simulations gives some examples of how to use this tool
- Reproduce Simulation Results shows how all simulation results can be reproduced
We also provide a Dockerfile that can be used to reproduce our simulation tests quickly. If you want to reproduce the results using Docker, you can skip the installation and execution procedures and refer directly to Reproduce Simulation Results Using Docker.
This repository also contains extra results and graphs that are not presented in the submitted version of the article. In Extra Results and Graphs, we show how you can check out these results.
This software was partially supported by the EuroHPC-funded project ADMIRE (Project ID: 956748, https://www.admire-eurohpc.eu).
## How to Cite This Material
If you find this material useful for your research or work, please consider citing it:
Francieli Boito, Guillaume Pallez, Luan Teylo, Nicolas Vidal. (2023). IO-Sets: Simple and efficient approaches for I/O bandwidth management. DOI: [10.5281/zenodo.8237993](https://doi.org/10.5281/zenodo.8237993)
## Practical tests overview

Besides the simulation results, our manuscript also presents practical executions (Section VI-A). In Practical Experiments we provide some information about these executions.

You can find the submitted version of our paper in `submitted.pdf` and the web supplementary material in `web_supplementary.pdf`.
## Getting Started

The code was developed on a Debian-like system (Ubuntu 20.04 LTS). All instructions reported here are based on this system. Version numbers indicate the versions that were tested and used in this project.
### Getting the Dependencies

- C++ compiler (g++ v9.3.0)
- CMake (v3.16.3)
- Python 3 (v3.9)
- Boost (v1.48)
- git

Recent versions of the dependencies can be installed as follows (in a shell):

```shell
apt install python3 python3-pip
apt install g++
apt install cmake
apt install libboost-dev libboost-context-dev
apt install git
```
### Installing the project

This project contains the following tools:

- The simulation tool that implements the IO-Sets methodology, called simgio (`src/` and `include/`)
- A Python package that facilitates the deployment and execution of numerous simulation tests, called PYsimgio (`PYsimgio/`)
Simgio is executed on top of the SimGrid (v3.30) framework. For the tests presented in the manuscript, the commit used from the SimGrid repository was 2618e575d819b0d0046c6c35a4ae96e94b61d0be.

So first, install SimGrid:

```shell
git clone https://framagit.org/simgrid/simgrid
cd simgrid/
git checkout 2618e575d819b0d0046c6c35a4ae96e94b61d0be
cmake -DCMAKE_INSTALL_PREFIX=/opt/simgrid .
make
make install
```
Second, install PYsimgio:

```shell
cd ../ # return to the root folder of the repository
pip install python-dateutil --upgrade
pip install setuptools
pip install -e .
pip install -r PYsimgio/requirement.txt
```
Third, compile simgio:

```shell
mkdir build
cd build
cmake ../
make
```

By default, the executable file simgio is placed in `bin/`.
### Configurations

Before starting, it is necessary to configure the variable SIMGIO_HOME in `PYsimgio/evaluation.py`. The variable needs to be set to the actual root directory of the project. That is the only parameter that needs to be defined.

```python
class Evaluation:
    # List of special characters
    COMMENT_CHAR = '#'
    HOST = 'bob'
    FUNCTION = 'host'
    SIMGIO_HOME = '$HOME/iosets'  # <-- THIS VARIABLE SHOULD CONTAIN THE FULL PATH TO THE SOURCE FOLDER OF THE PROJECT
    CSV_FOLDER = 'csv'
    XML_FOLDER = 'xml'
    GRAPH_FOLDER = 'graphs'
```
## Run simulations

### simgio Example

Here, we give an example of how to run a simulation with simgio. Simgio receives as input two XML files:

- the platform file;
- the deployment file.

The platform file describes the computational environment, while the deployment file describes the workload. An example of these files, with a complete description of each XML field, is presented in `example/deployment/simple_d.xml` and `example/platform/host_with_disk.xml`.
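For orientation, the sketch below generates a minimal platform description with Python's `xml.etree.ElementTree`. The attribute names follow the standard SimGrid platform format (DTD version 4.1), with one host named `bob` (the default in `PYsimgio/evaluation.py`) and an attached disk; the bandwidth values here are placeholders, and `example/platform/host_with_disk.xml` remains the authoritative example of what simgio expects.

```python
# Illustrative sketch, not the shipped example file: builds a minimal
# SimGrid-style platform with one host and one disk. Attribute names
# follow the standard SimGrid 4.1 platform format; values are placeholders.
import xml.etree.ElementTree as ET

def build_platform(host_id="bob", speed="1Gf",
                   read_bw="100MBps", write_bw="40MBps"):
    platform = ET.Element("platform", version="4.1")
    zone = ET.SubElement(platform, "zone", id="zone0", routing="Full")
    host = ET.SubElement(zone, "host", id=host_id, speed=speed)
    ET.SubElement(host, "disk", id="disk0",
                  read_bw=read_bw, write_bw=write_bw)
    return ET.tostring(platform, encoding="unicode")

print(build_platform())
```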
A simulation with these files can be executed as follows:

```shell
cd bin/
./simgio -p ../example/platform/host_with_disk.xml -d ../example/deployment/simple_d.xml -ts 1 -te 9 --log=simgio.thres:verbose
```

Note that the parameters `-ts` and `-te` (time frame begin and end) are mandatory. For more details about the time frame, see Section V-C in `submitted.pdf`.

The `--log` parameter is optional and controls the output verbosity of SimGrid (more info in `./simgio --help-logs`).
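If you script many runs yourself (outside PYsimgio), the invocation above can be assembled programmatically. The helper below is only a sketch: it builds the argument list using the flags shown above, making the mandatory `-ts`/`-te` explicit; pass the result to `subprocess.run()` from the `bin/` directory once simgio is compiled.

```python
# Hypothetical convenience helper: builds a simgio command line.
# Flag names mirror the example invocation above; -ts/-te are mandatory.
def simgio_command(platform, deployment, ts, te, log=None):
    cmd = ["./simgio", "-p", platform, "-d", deployment,
           "-ts", str(ts), "-te", str(te)]
    if log is not None:           # --log is optional (SimGrid verbosity)
        cmd.append(f"--log={log}")
    return cmd

cmd = simgio_command("../example/platform/host_with_disk.xml",
                     "../example/deployment/simple_d.xml",
                     ts=1, te=9, log="simgio.thres:verbose")
print(" ".join(cmd))
```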
⚠️ There are two distinct help menus in simgio:

- `--help` calls the SimGrid help menu
- `--h` calls the simgio help menu
### PYsimgio Example

PYsimgio was initially developed to facilitate the execution of our tests, but it evolved to become a fundamental part of simgio. It includes functions that allow us to generate a wide range of tests, handle simgio CSV outputs, and generate graphs.

#### Running PYsimgio tests

All tests are written and executed using the classes `PYsimgio.app_generator` and `PYsimgio.evaluation`.

- `PYsimgio.evaluation` is responsible for generating the workloads and executing a sequence of tests considering, for example, different scheduling strategies. At the end of the execution, the class writes a CSV with the simulation output of each generated workload.
- `PYsimgio.app_generator` is responsible for generating a job following the generation protocol described in Section V-B of `submitted.pdf`.
To illustrate PYsimgio usage, consider the example in `example/pysimgio/simulation.py`, where the `PYsimgio.evaluation` class is used to execute simulations with 20 distinct workloads composed of 60 jobs. At each execution, a workload with a different job distribution is created by calling the `tasks_per_mu` function. Then, `set_mapping` is used to create the IO-Sets. Finally, the `f_fair_share` and `f_set_10` functions are used to define the set priorities (in the case of `f_fair_share`, the priority is defined per job).
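To give an intuition for these two steps, the sketch below shows one natural set mapping (by order of magnitude of a job's characteristic I/O time) and the Set-10 priority rule (each set gets a priority ten times smaller than the previous one, then shares are normalized). The function names `set_of` and `set10_shares` are hypothetical; the actual logic lives in PYsimgio.

```python
# Back-of-the-envelope sketch of the IO-Sets mapping and Set-10 priorities.
# Function names are hypothetical; see PYsimgio for the real implementation.
import math

def set_of(witer):
    # Set index = order of magnitude of the job's characteristic I/O time.
    return math.floor(math.log10(witer))

def set10_shares(set_indices):
    # Set-10: priority of set k is 10**(-k); bandwidth shares are the
    # normalized priorities.
    prios = {k: 10.0 ** (-k) for k in set_indices}
    total = sum(prios.values())
    return {k: p / total for k, p in prios.items()}

jobs = [0.5, 3.0, 40.0, 70.0, 900.0]      # characteristic I/O times
sets = sorted({set_of(w) for w in jobs})  # -> [-1, 0, 1, 2]
print(sets, set10_shares(sets))
```

With this rule, the set holding the jobs with the smallest I/O times receives almost all of the bandwidth priority, which is the intended behavior of Set-10.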
```shell
cd example/pysimgio/
python3 run_tests.py
```
At the end of the execution, the following folders are created:

- `xml/`: stores all generated XML deployment files
- `csv/`: stores all output CSV files

A Jupyter notebook presenting the results of this evaluation can be found in `example/pysimgio/results.ipynb`. This notebook also illustrates the use of some functions available in `PYsimgio.simgio_utils.py`.

```shell
cd example/pysimgio/
jupyter notebook
```
## Reproduce Simulation Results

You can use these commands to reproduce the simulations and the results reported in Section VI of our manuscript:

```shell
cd simulation_results/
sh run_all.sh
```

After the end of the execution, you can reproduce figures 3 to 9 of the manuscript as follows:

```shell
cd simulation_results/plot
python3 plot00.py # Figure 4
python3 plot01.py ../heavy_io/60_apps/simulation_01/ 60 # Figures 5 and 6
python3 plot02.py ../heavy_io/60_apps/simulation_02/ 60 # Figure 7
python3 plot03.py ../heavy_io/60_apps/simulation_03/ 60 # Figure 8
python3 plot04.py ../heavy_io/60_apps/simulation_04/ 60 # Figure 9
```
⚠️ The computational platform used to execute the simulations is not relevant to reproducing the results reported in the manuscript; it only affects the time to run the simulations, which is not the focus of our study.
## Reproduce Simulation Results Using Docker

In the root directory of the project, we provide a `Dockerfile` that creates a container with all the tools needed to reproduce the simulations of our manuscript.

For the next steps, we assume that you already have Docker installed on your machine. If that is not the case, you can refer to https://docs.docker.com/engine/install/ to install it.

Before creating the container, check that the variable SIMGIO_HOME in `PYsimgio/evaluation.py` is set to `$HOME/iosets` (the default value). The simulation will not be executed inside the container if SIMGIO_HOME points to another path.

First, in the root directory of the project, create a Docker image:

```shell
docker build -t iosets .
```

Second, execute the container in interactive mode:

```shell
docker run -it iosets
```
Inside the container, you can use the following commands to reproduce all simulation results:

```shell
cd simulation_results/
sh run_all.sh
```

The figures of the manuscript can also be generated inside the container:

```shell
cd simulation_results/plot/
python3 plot00.py # Figure 4
python3 plot01.py ../heavy_io/60_apps/simulation_01/ 60 # Figures 5 and 6
python3 plot02.py ../heavy_io/60_apps/simulation_02/ 60 # Figure 7
python3 plot03.py ../heavy_io/60_apps/simulation_03/ 60 # Figure 8
python3 plot04.py ../heavy_io/60_apps/simulation_04/ 60 # Figure 9
```
## Extra Results and Graphs

All folders with the simulation results contain a Jupyter notebook called `results.ipynb` with extra graphs.

Jupyter is included in `PYsimgio/requirements.txt`. Therefore, if you followed the installation steps, the jupyter package should already be installed on your machine. Otherwise, you can install it as follows:

```shell
pip install jupyter
```

You can then open Jupyter's web interface, navigate among the folders, and open the notebooks to check the extra graphs:

```shell
cd simulation_results/
jupyter notebook
```
## Practical Experiments

In the folder `realmachine_experiments/` you will find all the code used to generate the practical results presented in Section VI-A, as well as the results themselves.

- The `ior_schedclient` folder contains IOR 3.4.0 modified to talk to the scheduler using sockets (over TCP). It receives as an argument the IP of the server. This implementation only works for periodic write phases (with no reading). It has been compiled and executed with OpenMPI 4.0.3. The new command-line options are `-p` for the IP of the scheduler and `-P` for the witer. An estimate of the witer must be provided, because that information is used by the scheduler (it does not compute it automatically).
- The `scheduler` folder contains the scheduler. At the beginning of the .c file, we can change the scheduling algorithm and the verbosity level (debug as true or false). The scheduler must be running before starting the IOR instances.
- In the `run_scripts` folder, the `run_with_scheduler` and `run_without_scheduler` scripts were used in Slurm scripts to run the experiments. They call `run_concior_test.py` to start the concurrent IOR instances. All parameters used to call IOR are configured at the beginning of the Python script.
- In the `parse_results` folder, we can find the Python scripts used to calculate the reported metrics from the results (stretch, utilization, etc.).
- The `results` folder contains all results, one test per folder. Inside each test folder, we can find the standard output of each IOR instance and of the scheduler. This was omitted from the double-blind review version because the output mentions the folder where the tests were executed, including the username.
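The transport pattern between the modified IOR and the scheduler can be illustrated with a toy Python sketch: a client connects over TCP and reports its witer estimate. This does not reproduce the actual message format of `ior_schedclient`/`scheduler` (the payload shown is hypothetical); it only demonstrates the socket handshake described above.

```python
# Toy illustration of the IOR <-> scheduler handshake: the client
# connects over TCP and reports its witer estimate. The message format
# "witer=..." is hypothetical, not the project's actual protocol.
import socket
import threading

def scheduler(server_sock, received):
    # Accept one client and record the reported estimate.
    conn, _ = server_sock.accept()
    with conn:
        received.append(conn.recv(64).decode())

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

received = []
t = threading.Thread(target=scheduler, args=(server, received))
t.start()

# IOR side: connect to the scheduler's IP/port and send the estimate.
with socket.create_connection(("127.0.0.1", port)) as client:
    client.sendall(b"witer=2.5")

t.join()
server.close()
print(received)  # ['witer=2.5']
```

In the real setup the scheduler must already be running before the IOR instances start, exactly as in this sketch, where the listening socket exists before the client connects.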