diff --git a/Doc/noDist/implicit/figure/FMM.png b/Doc/noDist/implicit/figure/FMM.png
new file mode 100644
index 0000000000000000000000000000000000000000..b3a76f3feec7324de69c0919d9f58ab1cb829320
Binary files /dev/null and b/Doc/noDist/implicit/figure/FMM.png differ
diff --git a/Doc/noDist/implicit/figure/blocked.png b/Doc/noDist/implicit/figure/blocked.png
new file mode 100644
index 0000000000000000000000000000000000000000..4be11fb6415c91d2f8bd6e9a4b16250e06bb5c10
Binary files /dev/null and b/Doc/noDist/implicit/figure/blocked.png differ
diff --git a/Doc/noDist/implicit/figure/ellipsedistribution.png b/Doc/noDist/implicit/figure/ellipsedistribution.png
new file mode 100644
index 0000000000000000000000000000000000000000..25d2b0bb15dd443a0926dc5f006101d9a61c3e1a
Binary files /dev/null and b/Doc/noDist/implicit/figure/ellipsedistribution.png differ
diff --git a/Doc/noDist/implicit/figure/octree.png b/Doc/noDist/implicit/figure/octree.png
new file mode 100644
index 0000000000000000000000000000000000000000..f071d5e95cf5c0e21256609796760f77d8ee1e2b
Binary files /dev/null and b/Doc/noDist/implicit/figure/octree.png differ
diff --git a/Doc/noDist/implicit/figure/uniformdistribution.png b/Doc/noDist/implicit/figure/uniformdistribution.png
new file mode 100644
index 0000000000000000000000000000000000000000..85185c4d9b62f47311ca8426899f8f0b2d7100e6
Binary files /dev/null and b/Doc/noDist/implicit/figure/uniformdistribution.png differ
diff --git a/Doc/noDist/implicit/implicit.org b/Doc/noDist/implicit/implicit.org
index 7ac98572fc1d2f8123f175dd205962f4cd6b5aff..2a2d76413f9a1ae0cc727671afd80194a8edd9b1 100644
--- a/Doc/noDist/implicit/implicit.org
+++ b/Doc/noDist/implicit/implicit.org
@@ -8,18 +8,12 @@
#+EXPORT_EXCLUDE_TAGS: noexport
#+TAGS: noexport(n)
-
-# #+BEGIN_SRC sh
-# export SCALFMM_DIR=/home/mkhannou/scalfmm
-# cd $SCALFMM_DIR
-# git checkout mpi_implicit
-# spack install scalfmm@src+mpi+starpu \^starpu@svn-trunk+mpi+fxt \^openmpi
-# #+END_SRC
-
* Abstract
-We live in a world were computer capacity get larger and larger, unfortunatly, our old algorithm ain't calibrate for such computer so it is important to find new paradigme to use the full power of those newest machine and then go faster than ever.
- The Fast Multipole Methode (FMM) is one of the most prominent algorithms to perform pair-wise particle interactions, with application in many physical problems such as astrophysical simulation, molecular dynmics, the boundary element method, radiosity in computer-graphics or dislocation dynamics. Its linear complexity makes it an excellent candidate for large-scale simulation.
+ This document describes how to switch from an explicit StarPU MPI code to
+ an implicit StarPU MPI code. It also describes the methodology used to
+ compare MPI algorithms in ScalFMM and the results obtained.
+
* Introduction
** N-Body problem
<<sec:nbody_problem>>
@@ -42,16 +36,24 @@ The FMM is used to solve a variety of problems: astrophysical simulations, molec
NOTE: Directly come from Bérenger thesis.
*** Algorithm
-The FMM algorithm rely on an octree (quadtree in 2 dimensions) obtained by splitting the space of the simulation recursivly in 8 parts (4 parts in 2D). The building is shown on figure [[fig:octree]].
+The FMM algorithm relies on an octree (quadtree in 2D) obtained by splitting the simulation space recursively into 8 parts (4 parts in 2D). The construction is shown in figure [[fig:octree]].
-#+CAPTION: On the left side is the box with all the particles. In the middle is the same box as before, split three time in four parts. Whiche give 64 smaller boxes in the end. On the right is the quadtree (octree in 3D) with an height of 4 built from the different splitting. On top of it is the root of the tree that hold the whole box and so all particles.
+#+CAPTION: 2D space decomposition (quadtree). Grid view and hierarchical view.
#+name: fig:octree
[[./figure/octree.png]]

-#+CAPTION: FMM algorithm.
+#+CAPTION: Different steps of the FMM algorithm: upward pass (left), transfer pass and direct step (center), and downward pass (right).
+#+name: fig:algorithm
[[./figure/FMM.png]]

+The algorithm is illustrated in figure [[fig:algorithm]]. It first applies
+the P2M operator to approximate the particles of each leaf in a multipole.
+Then the M2M operator propagates these approximations from each level to
+the level above. The M2L operator then transfers approximations between
+well-separated cells and the P2P operator computes the direct interactions
+between neighboring leaves. Finally, the L2L operator pushes the local
+approximations down to the level below and L2P applies the approximations
+of the last level to the particles.
+
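+To make the pass structure concrete, the sketch below shows the order in
+which the operators are applied. It is only a schematic illustration: the
+tree type, the level bounds and the operator stubs are assumptions, not
+ScalFMM code.
+#+begin_src c
+#include <stdio.h>
+
+typedef struct { int height; } Tree;   /* leaves live at level height-1 */
+
+static void P2M(Tree *t)        { (void)t; printf("P2M\n"); }
+static void M2M(Tree *t, int l) { (void)t; printf("M2M level %d\n", l); }
+static void M2L(Tree *t, int l) { (void)t; printf("M2L level %d\n", l); }
+static void P2P(Tree *t)        { (void)t; printf("P2P\n"); }
+static void L2L(Tree *t, int l) { (void)t; printf("L2L level %d\n", l); }
+static void L2P(Tree *t)        { (void)t; printf("L2P\n"); }
+
+void fmm_pass(Tree *tree)
+{
+    P2M(tree);                                  /* upward: particles -> leaf multipoles */
+    for (int l = tree->height - 2; l >= 2; --l)
+        M2M(tree, l);                           /* upward: children -> parent, level by level */
+    for (int l = 2; l < tree->height; ++l)
+        M2L(tree, l);                           /* transfer between well-separated cells */
+    P2P(tree);                                  /* direct step between neighboring leaves */
+    for (int l = 2; l < tree->height - 1; ++l)
+        L2L(tree, l);                           /* downward: parent -> children */
+    L2P(tree);                                  /* downward: leaf locals -> particles */
+}
+
+int main(void)
+{
+    Tree tree = { 4 };
+    fmm_pass(&tree);
+    return 0;
+}
+#+end_src
+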
* State of the art
Nothing for now ...
** Task based FMM
@@ -77,20 +79,93 @@ In that way, the main algorithm remain almost as simple as the sequential one an
It also create a DAG from which interesting property can be used to prove interesting stuff. (Not really sure)
*** Group tree
-What is a group tree and what is it doing here ?
-The task scheduling with a smart runtime such as StarPU cost a lot.
-The FMM generate a huge amount of small tasks, which considerably increase the time spent into the scheduler.
-The group tree pack particule or multipole together into a group and execute a task (P2P, P2M, M2M, ...) on a group of particles (or multipole) rather than only one particle (or multipole).
-This granularity of the task and make the cost of the scheduler decrease.
+ A group tree is like the original octree (or quadtree in 2D) where cells and
+ particles are packed together into new cells and new "particles". Tasks are
+ then executed on those groups rather than on a single particle (or
+ multipole).
+
+ The group tree was introduced because the original algorithm generated too
+ many small tasks and the time spent in the runtime was too high. With the
+ group tree, tasks become big enough for the runtime overhead to be
+ negligible again.
+
+ A group tree is built following a simple rule: given a group size,
+ particles (or multipoles) sorted by Morton index are grouped together
+ regardless of their parents or children.

-TODO: image of the group tree
+#+CAPTION: A quadtree and the corresponding group tree with Ng=3.
+#+name: fig:grouptree
+[[./figure/blocked.png]]
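+
+To make the grouping rule concrete, here is a minimal sketch (the cell
+array, its Morton indices and the printed output are made up for the
+example; this is not the ScalFMM implementation):
+#+begin_src c
+#include <stdio.h>
+
+typedef long long MortonIndex;
+
+/* Pack cells, already sorted by Morton index, Ng by Ng,
+   regardless of their parent/child relations. */
+void build_groups(const MortonIndex *cells, int nbCells, int Ng)
+{
+    for (int first = 0; first < nbCells; first += Ng) {
+        int last = (first + Ng < nbCells) ? first + Ng : nbCells;
+        printf("group [%lld .. %lld] (%d cells)\n",
+               cells[first], cells[last - 1], last - first);
+    }
+}
+
+int main(void)
+{
+    MortonIndex cells[] = {0, 1, 2, 3, 5, 8, 9, 12, 13, 14};
+    build_groups(cells, 10, 3);   /* Ng = 3, as in figure fig:grouptree */
+    return 0;
+}
+#+end_src
+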
+*** Scalfmm
*** Distributed FMM
* Implicit MPI FMM
** Sequential Task Flow with implicit communication
-Two differents things :
-- Register data handle in starpu_mpi
-- Define a data mapping function so each handle will be placed ton an mpi node.
+ There are very few differences between the STF version and the implicit
+ MPI STF version.
+
+*** Init
+ The first difference between a plain StarPU algorithm and an implicit
+ StarPU MPI one is the call to /starpu_mpi_init/ right after /starpu_init/
+ and the call to /starpu_mpi_shutdown/ right before /starpu_shutdown/.
+
+ The call to /starpu_mpi_init/ looks like:
+#+begin_src c
+starpu_mpi_init(argc, argv, initialize_mpi)
+#+end_src
+ /initialize_mpi/ should be set to 0 if a call to /MPI_Init/ (or
+ /MPI_Init_thread/) has already been made.
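+
+ As a reminder of the required call order, a minimal skeleton could look
+ like the following (error handling and task submission are omitted; this
+ is an illustration, not code taken from ScalFMM):
+#+begin_src c
+#include <starpu.h>
+#include <starpu_mpi.h>
+
+int main(int argc, char **argv)
+{
+    if (starpu_init(NULL) != 0)
+        return 1;
+    /* Last argument set to 1: StarPU performs the MPI initialization
+       itself. Pass 0 if MPI_Init (or MPI_Init_thread) was already called. */
+    if (starpu_mpi_init(&argc, &argv, 1) != 0)
+        return 1;
+
+    /* ... register handles and insert tasks here ... */
+
+    starpu_mpi_shutdown();
+    starpu_shutdown();
+    return 0;
+}
+#+end_src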
+
+*** Data handle
+ The second difference is the way StarPU handles are registered.
+ There is still the classical call to /starpu_variable_data_register/ so
+ that StarPU knows the data, but a call to /starpu_mpi_data_register/ is
+ also needed.
+
+ The call looks like this:
+#+begin_src c
+starpu_mpi_data_register(starpu_handle, tag, mpi_rank)
+#+end_src
+ /starpu_handle/ : the handle used by StarPU to work with the data.
+
+ /tag/ : the MPI tag, which must be different for each handle but must
+ correspond to the same handle on every MPI node.
+
+ /mpi_rank/ : the MPI node on which the data will be stored.
+
+ Note that when a handle is registered on a node different from the
+ current one, the call to /starpu_variable_data_register/ should look like:
+#+begin_src c
+starpu_variable_data_register(starpu_handle, -1, buffer, buffer_size);
+#+end_src
+ The -1 specifies that the data is not stored in the local main memory; in
+ this case, it is stored on another node.
+
+ At the end of the application, a handle should be unregistered with
+ /starpu_data_unregister/, but only on the node where it was registered.
+
+*** Data mapping function
+ The last difference, and probably the most interesting one, is the data
+ mapping function. Given information about a piece of data, this function
+ must return the node on which that data will be mapped.
+
+ For now, in ScalFMM, it uses the level in the octree and the Morton index
+ in this level, but it could use anything, such as external information
+ previously computed by another software.
+
+ For now, here is the data mapping function:
+#+begin_src c
+// Return the MPI rank whose Morton interval at idxLevel contains idx.
+int dataMappingBerenger(MortonIndex const idx, int const idxLevel) const {
+    for(int i = 0; i < nproc; ++i)
+        if(nodeRepartition[idxLevel][i][0] <= nodeRepartition[idxLevel][i][1] && idx >= nodeRepartition[idxLevel][i][0] && idx <= nodeRepartition[idxLevel][i][1])
+            return i;
+    if(mpi_rank == 0)
+        cout << "[scalfmm][map error] idx " << idx << " on level " << idxLevel << " isn't mapped on any process." << endl;
+    return -1;
+}
+#+end_src
+
+ /nodeRepartition/ is an array that describes, for each level, the working
+ interval of each node.
+
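+ To tie the two previous points together, the mapping function can be used
+ to decide the owner when each handle is registered. The snippet below is
+ only a sketch: /buffer/, /buffer_size/, /tag/ and /mpi_rank/ are
+ assumptions, not the actual ScalFMM code.
+#+begin_src c
+starpu_data_handle_t handle;
+int owner = dataMappingBerenger(mortonIndex, idxLevel);
+if(owner == mpi_rank){
+    /* The data lives in this node's memory. */
+    starpu_variable_data_register(&handle, STARPU_MAIN_RAM,
+                                  (uintptr_t)buffer, buffer_size);
+}
+else{
+    /* The data lives on another node: no local pointer. */
+    starpu_variable_data_register(&handle, -1, (uintptr_t)NULL, buffer_size);
+}
+/* Same tag on every node for this piece of data; the owner comes from the mapping. */
+starpu_mpi_data_register(handle, tag, owner);
+#+end_src
+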
** Data Mapping
One of the main advantage of using implicit mpi communication in starpu is that tha data mapping can be separated from the algorithm.
It is then possible to change the data mapping without changing the algorithm.
@@ -110,124 +185,90 @@ This tool could be used to force certain data mapping in the implicit mpi versio
** Result
*** Hardware
-One node equal 2 Dodeca-core Haswell Intel® Xeon® E5-2680, 2,5GHz, 128Go de RAM (DDR4 2133MHz), 500 Go de stockage (Sata).
+One node has two dodeca-core Haswell Intel® Xeon® E5-2680 processors at 2.5 GHz, 128 GB of RAM (DDR4 2133 MHz) and 500 GB of storage (SATA).
*** Aims
-Compare explicit and implicit version.
-But to measure the impact of implicit communication we need an implicit version as close to the explicit version as possible.
-Mainly, this means, same particules onto the same grouped tree with same task executed on the same node.
+ The aim is to compare the explicit and implicit versions, as well as any
+ other MPI version or MPI data mapping.
+ To measure the impact of implicit communication, we need an implicit version as close to the explicit version as possible.
+ Mainly, this means the same particles in the same group tree, with the same tasks executed on the same nodes.
+
+ All algorithms are studied on two different particle distributions, a
+ uniform cube and an ellipsoid, shown in figures [[fig:uniform]] and
+ [[fig:ellipse]].
+
+
+#+CAPTION: Cube (volume).
+#+name: fig:uniform
+[[./figure/uniformdistribution.png]]
+
+#+CAPTION: Ellipsoid (surface).
+#+name: fig:ellipse
+[[./figure/ellipsedistribution.png]]
+
+ The point of working on the uniform cube is to validate algorithms on a
+ simple case. It also allows us to check for any performance regression.
+ The ellipsoid (which is a surface, not a volume) is a more challenging
+ particle set because it generates an unbalanced tree. It is used to see
+ whether one algorithm is better than another.
+
+*** Description of the plots
+**** Time
+ The time plot displays the time spent in each part of the execution.
+ It is useful to diagnose what takes the most time in a run.
+**** Parallel efficiency
+ The parallel efficiency plot displays how much faster an algorithm is
+ compared to its one-node version.
+**** Normalized time
+ The normalized time plot shows the speedup compared to a one-node
+ algorithm, namely the StarPU algorithm without any MPI communication
+ in it.
+**** Efficiency
+ Not sure yet
+**** Speedup
+ The speedup plot shows how much faster an algorithm is compared to a
+ reference algorithm.
+ The explicit algorithm was used as the reference. It was chosen instead
+ of the StarPU algorithm because the comparison is done for each number
+ of nodes and the StarPU algorithm (without any MPI communication) only
+ runs on one node.
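+
+ For reference, the quantities above can be computed as in the short
+ sketch below. The formulas are the usual definitions and the numbers are
+ made up; the actual plots are produced by the R scripts, which may differ.
+#+begin_src c
+#include <stdio.h>
+
+int main(void)
+{
+    double t_ref = 62.0;   /* hypothetical one-node reference time (s) */
+    double t_N   = 9.4;    /* hypothetical time on N nodes (s) */
+    int    N     = 10;
+
+    double speedup    = t_ref / t_N;        /* how much faster than the reference */
+    double efficiency = t_ref / (N * t_N);  /* 1.0 would be perfect scaling */
+    double normalized = t_N / t_ref;        /* fraction of the reference time */
+
+    printf("speedup=%.2f efficiency=%.2f normalized=%.2f\n",
+           speedup, efficiency, normalized);
+    return 0;
+}
+#+end_src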
+*** Measurement
+ To compute the execution time and make sure it is measured the same way
+ for every algorithm, we do the following:
+
+#+begin_src C
+mpiComm.global().barrier();
+timer.tic();
+groupalgo.execute();
+mpiComm.global().barrier();
+timer.tac();
+#+end_src
+
+ A barrier is placed before starting the measurement and another one at
+ the end, before stopping it.
+ What is measured corresponds to the time of one iteration of the
+ algorithm, without the time of object creation or pre-computation of the
+ kernel.
+
+ There is still a tiny exception for the StarPU algorithm (the StarPU
+ version without MPI): because this algorithm always runs on one node,
+ there is no need to add MPI barriers to correctly measure its
+ execution time.
+
+
*** Scripts and jobs
<<sec:result>>
The scripts of the jobs:
-#+BEGIN_SRC
-#!/usr/bin/env bash
-## name of job
-#SBATCH -J chebyshev_50M_10_node
-#SBATCH -p longq
-## Resources: (nodes, procs, tasks, walltime, ... etc)
-#SBATCH -N 10
-#SBATCH -c 24
-# # standard output message
-#SBATCH -o chebyshev_50M_10_node%j.out
-# # output error message
-#SBATCH -e chebyshev_50M_10_node%j.err
-#SBATCH --mail-type=ALL --mail-user=martin.khannouz@inria.fr
-module purge
-module load slurm
-module add compiler/gcc/5.3.0 tools/module_cat/1.0.0 intel/mkl/64/11.2/2016.0.0
-. /home/mkhannou/spack/share/spack/setup-env.sh
-spack load fftw
-spack load hwloc
-spack load openmpi
-spack load starpu@svn-trunk+fxt
-## modules to load for the job
-export GROUP_SIZE=500
-export TREE_HEIGHT=8
-export NB_NODE=$SLURM_JOB_NUM_NODES
-export STARPU_NCPU=24
-export NB_PARTICLE_PER_NODE=5000000
-export STARPU_FXT_PREFIX=`pwd`/
-echo "=====my job informations ===="
-echo "Node List: " $SLURM_NODELIST
-echo "my jobID: " $SLURM_JOB_ID
-echo "Nb node: " $NB_NODE
-echo "Particle per node: " $NB_PARTICLE_PER_NODE
-echo "Total particles: " $(($NB_PARTICLE_PER_NODE*$NB_NODE))
-echo "In the directory: `pwd`"
-rm -f canard.fma > /dev/null 2>&1
-mpiexec -n $NB_NODE ./Build/Tests/Release/testBlockedMpiChebyshev -nb $NB_PARTICLE_PER_NODE -bs $GROUP_SIZE -h $TREE_HEIGHT -no-validation | grep Average
-#TODO probably move trace.rec somewhere else ...
-mpiexec -n $NB_NODE ./Build/Tests/Release/testBlockedImplicitChebyshev -f canard.fma -bs $GROUP_SIZE -h $TREE_HEIGHT -no-validation | grep Average
-#+END_SRC
-
-and
-
-#+BEGIN_SRC
-#!/usr/bin/env bash
-## name of job
-#SBATCH -J chebyshev_50M_1_node
-#SBATCH -p longq
-## Resources: (nodes, procs, tasks, walltime, ... etc)
-#SBATCH -N 1
-#SBATCH -c 24
-# # standard output message
-#SBATCH -o chebyshev_50M_1_node%j.out
-# # output error message
-#SBATCH -e chebyshev_50M_1_node%j.err
-#SBATCH --mail-type=ALL --mail-user=martin.khannouz@inria.fr
-module purge
-module load slurm
-module add compiler/gcc/5.3.0 tools/module_cat/1.0.0 intel/mkl/64/11.2/2016.0.0
-. /home/mkhannou/spack/share/spack/setup-env.sh
-spack load fftw
-spack load hwloc
-spack load openmpi
-spack load starpu@svn-trunk+fxt
-## modules to load for the job
-export GROUP_SIZE=500
-export TREE_HEIGHT=8
-export NB_NODE=$SLURM_JOB_NUM_NODES
-export STARPU_NCPU=24
-export NB_PARTICLE_PER_NODE=50000000
-export STARPU_FXT_PREFIX=`pwd`/
-echo "=====my job informations ===="
-echo "Node List: " $SLURM_NODELIST
-echo "my jobID: " $SLURM_JOB_ID
-echo "Nb node: " $NB_NODE
-echo "Particle per node: " $NB_PARTICLE_PER_NODE
-echo "Total particles: " $(($NB_PARTICLE_PER_NODE*$NB_NODE))
-echo "In the directory: `pwd`"
-rm -f canard.fma > /dev/null 2>&1
-mpiexec -n $NB_NODE ./Build/Tests/Release/testBlockedChebyshev -nb $NB_PARTICLE_PER_NODE -bs $GROUP_SIZE -h $TREE_HEIGHT -no-validation | grep Kernel
-#+END_SRC
-The result given by the script after few minutes executing:
+#+include: "~/scalfmm/jobs/starpu_chebyshev.sh" src sh

-Result for 10 nodes.
-#+BEGIN_EXAMPLE
-=====my job informations ====
-Node List: miriel[022-031]
-my jobID: 109736
-Nb node: 10
-Particle per node: 5000000
-Total particles: 50000000
-In the directory: /home/mkhannou/scalfmm
-Average time per node (explicit Cheby) : 9.35586s
-Average time per node (implicit Cheby) : 10.3728s
-#+END_EXAMPLE
-
-Result for 1 node.
-#+BEGIN_EXAMPLE
-=====my job informations ====
-Node List: miriel036
-my jobID: 109737
-Nb node: 1
-Particle per node: 50000000
-Total particles: 50000000
-In the directory: /home/mkhannou/scalfmm
-Kernel executed in in 62.0651s
-#+END_EXAMPLE
+ The results are stored in one directory at ~/scalfmm/jobs_results on
+ plafrim. They need to be downloaded and aggregated.
+ This work is done by the two following scripts. All results are aggregated
+ into a single csv file, which is then used by R scripts to generate the plots.

-As you can see, on only one node, it took a little more than one minutes to run the algorithm. It took only 10 seconds and 14 seconds for the explicit and implicit version.
+#+include: "~/suricate.sh" src sh
+#+include: "~/scalfmm/Utils/benchmark/loutre.py" src python

* Notes
** Installing
@@ -302,7 +343,7 @@ ssh mkhannou@plafrim "/home/mkhannou/spack/bin/spack mirror add local_filesystem
ssh mkhannou@plafrim '/home/mkhannou/spack/bin/spack install starpu@svn-trunk+mpi+fxt \^openmpi'

-TODO add script I add on plafrim side with library links.
+ TODO: add the script I use on the plafrim side with the library links.
*** Execute on plafrim
To run my tests on plafrim, I used the two following scripts.
@@ -325,28 +366,26 @@ export LIBRARY_PATH=/usr/lib64:$LIBRARY_PATH
export SPACK_ROOT=$HOME/spack
. $SPACK_ROOT/share/spack/setup-env.sh
-#Load dependencies for starpu and scalfmm
spack load fftw
spack load hwloc
spack load openmpi
-spack load starpu@svn-trunk~fxt
+spack load starpu@svn-trunk+fxt
cd scalfmm/Build
-#Configure and build scalfmm and scalfmm tests
rm -rf CMakeCache.txt CMakeFiles > /dev/null
-cmake .. -DSCALFMM_USE_MPI=ON -DSCALFMM_USE_STARPU=ON -DSCALFMM_USE_FFT=ON -DSCALFMM_BUILD_EXAMPLES=ON -DSCALFMM_BUILD_TESTS=ON -DCMAKE_CXX_COMPILER=`which g++`
+cmake .. -DSCALFMM_USE_MPI=ON -DSCALFMM_USE_STARPU=ON -DSCALFMM_USE_FFT=ON -DSCALFMM_BUILD_EXAMPLES=ON -DSCALFMM_BUILD_TESTS=ON -DCMAKE_CXX_COMPILER=`which g++`
make clean
-make -j `nproc`
+make testBlockedChebyshev testBlockedImplicitChebyshev testBlockedMpiChebyshev testBlockedImplicitAlgorithm testBlockedMpiAlgorithm

-#Submit jobs
cd ..
-files=./jobs/*.sh
+files=./jobs/*.sh
+mkdir jobs_result
for f in $files
do
echo "Submit $f..."
sbatch $f
+
if [ "$?" != "0" ] ; then
-	echo "Error submitting $f."
break;
fi
done
@@ -356,14 +395,8 @@ done
A good place I found to put your orgmode file and its html part is on the inria forge, in your project repository.
For me it was the path /home/groups/scalfmm/htdocs.
So I created a directory named orgmode and create the following script to update the files.
-#+begin_src sh
-cd Doc/noDist/implicit
-emacs implicit.org --batch -f org-html-export-to-html --kill
-ssh scm.gforge.inria.fr "cd /home/groups/scalfmm/htdocs/orgmode/; rm -rf implicit"
-cd ..
-scp -r implicit scm.gforge.inria.fr:/home/groups/scalfmm/htdocs/orgmode/
-ssh scm.gforge.inria.fr "cd /home/groups/scalfmm/htdocs/orgmode/; chmod og+r implicit -R;"
-#+end_src
+
+#+include: "~/scalfmm/export_orgmode.sh" src sh

* Journal
@@ -499,6 +532,11 @@ Mais c'est données n'impliquent pas de forcément des transitions de données m
- Modifier les jobs pour qu'il utilisent la même graine et générent le même ensemble de particules
- Post traiter les traces d'une exécution pour créer des graphiques.
- Exploiter les scripts de Samuel
+- Built the "pipeline" to generate the plots
+  - Script to send everything to plafrim
+  - Script to submit all the jobs
+  - Script to aggregate the results and generate the plots
+  - Tested the pipeline (a bit slow)
** Et après ?
diff --git a/Utils/loutre.py b/Utils/loutre.py deleted file mode 100755 index 63e82056bc74b3c14b88cf3a54e8e854b0d8b61d..0000000000000000000000000000000000000000 --- a/Utils/loutre.py +++ /dev/null @@ -1,204 +0,0 @@ -#!/usr/bin/python -import getopt -import sys -import math -import copy -import os -import socket -import subprocess -import re -import types - -class ScalFMMConfig(object): - num_threads = 1 - num_nodes = 1 - algorithm = "implicit" - model = "cube" - num_particules = 10000 - height = 4 - bloc_size = 100 - order = 5 - - def show(self): - print ("=== Simulation parameters ===") - print ("Number of nodes: " + str(self.num_nodes)) - print ("Number of threads: " + str(self.num_threads)) - print ("Model: " + str(self.model)) - print ("Number of particules: " + str(self.num_particules)) - print ("Height: " + str(self.height)) - print ("Bloc size: " + str(self.bloc_size)) - print ("Order: " + str(self.order)) - - def gen_header(self): - columns = [ - "model", - "algo", - "nnode", - "nthreads", - "npart", - "height", - "bsize", - "global_time", - "runtime_time", - "task_time", - "idle_time", - "scheduling_time", - "communication_time", - "rmem", - ] - header = "" - for i in range(len(columns)): - if not i == 0: - header += "," - header += "\"" + columns[i] + "\"" - header += "\n" - return header - - - def gen_record(self, global_time, runtime_time, task_time, idle_time, scheduling_time, rmem): - columns = [ - self.model, - self.algorithm, - self.num_nodes, - self.num_threads, - self.num_particules, - self.height, - self.bloc_size, - global_time, - runtime_time, - task_time, - idle_time, - scheduling_time, - 0.0, - rmem, - ] - record = "" - for i in range(len(columns)): - if not i == 0: - record += "," - if (type(columns[i]) is bool or - type(columns[i]) == str): - record += "\"" - record += str(columns[i]) - if (type(columns[i]) == bool or - type(columns[i]) == str): - record += "\"" - record += "\n" - return record - -def get_times_from_trace_file(filename): - cmd = "starpu_trace_state_stats.py " + filename - proc = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE) - stdout, stderr = proc.communicate() - if not proc.returncode == 0: - sys.exit("FATAL: Failed to parse trace.rec!") - return proc.returncode - task_time = 0.0 - idle_time = 0.0 - runtime_time = 0.0 - scheduling_time = 0.0 - for line in stdout.decode().splitlines(): - arr = line.replace("\"", "").split(",") - if arr[0] == "Name": - continue - if len(arr) >= 4: - if arr[2] == "Runtime": - if arr[0] == "Scheduling": - scheduling_time = float(arr[3]) - else: - runtime_time = float(arr[3]) - elif arr[2] == "Task": - task_time += float(arr[3]) - elif arr[2] == "Other": - idle_time = float(arr[3]) - # sys.exit("Invalid time!") - return runtime_time, task_time, idle_time, scheduling_time - -def main(): - output_trace_file="" - trace_filename="trace.rec" - output_filename="loutre.db" - - long_opts = ["help", - "trace-file=", - "output-trace-file=", - "output-file="] - - opts, args = getopt.getopt(sys.argv[1:], "ht:i:o:", long_opts) - for o, a in opts: - if o in ("-h", "--help"): - # usage() - print("No help") - sys.exit() - elif o in ("-t", "--trace-file"): - trace_filename = str(a) - elif o in ("-i", "--output-trace-file"): - output_trace_file = str(a) - elif o in ("-o", "--output-file"): - output_filename = str(a) - else: - assert False, "unhandled option" - - config=ScalFMMConfig() - rmem = 0 - global_time = 0.0 - runtime_time = 0.0 - task_time = 0.0 - idle_time = 0.0 - scheduling_time = 0.0 - - if 
(os.path.isfile(output_filename)): #Time in milli - output_file = open(output_filename, "a") - else: - output_file = open(output_filename, "w") - output_file.write(config.gen_header()) - - with open(output_trace_file, "r") as ins: - for line in ins: - if re.search("Average", line): - a = re.findall("[-+]?\d*\.\d+|\d+", line) - if len(a) == 1: - global_time = a[0] - elif re.search("Total Particles", line): - a = re.findall("[-+]?\d*\.\d+|\d+", line) - if len(a) == 1: - config.num_particules = int(a[0]) - elif re.search("Total Particles", line): - a = re.findall("[-+]?\d*\.\d+|\d+", line) - if len(a) == 1: - config.num_particules = int(a[0]) - elif re.search("Group size", line): - a = re.findall("[-+]?\d*\.\d+|\d+", line) - if len(a) == 1: - config.bloc_size = int(a[0]) - elif re.search("Nb node", line): - a = re.findall("[-+]?\d*\.\d+|\d+", line) - if len(a) == 1: - config.num_nodes = int(a[0]) - elif re.search("Tree height", line): - a = re.findall("[-+]?\d*\.\d+|\d+", line) - if len(a) == 1: - config.height = int(a[0]) - elif re.search("Nb thread", line): - a = re.findall("[-+]?\d*\.\d+|\d+", line) - if len(a) == 1: - config.num_threads = int(a[0]) - elif re.search("Model", line): - config.model = line[line.index(":")+1:].strip() - elif re.search("Algorithm", line): - config.algorithm = line[line.index(":")+1:].strip() - - if (os.path.isfile(trace_filename)): #Time in milli - runtime_time, task_time, idle_time, scheduling_time = get_times_from_trace_file(trace_filename) - else: - print("File doesn't exist " + trace_filename) - - # Write a record to the output file. - output_file.write(config.gen_record(float(global_time), - float(runtime_time), - float(task_time), - float(idle_time), - float(scheduling_time), - int(rmem))) - -main() diff --git a/export_orgmode.sh b/export_orgmode.sh new file mode 100755 index 0000000000000000000000000000000000000000..66c2288898a40d854c9245561efcaca1db72fc64 --- /dev/null +++ b/export_orgmode.sh @@ -0,0 +1,7 @@ +#!/bin/bash +cd /home/mkhannou/scalfmm/Doc/noDist/implicit +emacs implicit.org --batch -f org-html-export-to-html --kill +ssh scm.gforge.inria.fr "cd /home/groups/scalfmm/htdocs/orgmode/; rm -rf implicit" +cd .. +scp -r implicit scm.gforge.inria.fr:/home/groups/scalfmm/htdocs/orgmode/ +ssh scm.gforge.inria.fr "cd /home/groups/scalfmm/htdocs/orgmode/; chmod og+r implicit -R;" diff --git a/jobs/explicit_10N_chebyshev.sh b/jobs/explicit_10N_chebyshev.sh new file mode 100644 index 0000000000000000000000000000000000000000..f2296180dd2ee58b87b3cfaee941f139f0f4c9fe --- /dev/null +++ b/jobs/explicit_10N_chebyshev.sh @@ -0,0 +1,53 @@ +#!/usr/bin/env bash +## name of job +#SBATCH -J explicit_50M_10N +#SBATCH -p special +## Resources: (nodes, procs, tasks, walltime, ... etc) +#SBATCH -N 10 +#SBATCH -c 24 +#SBATCH --time=00:30:00 +# # output error message +#SBATCH -e explicit_chebyshev_50M_10_node%j.err +#SBATCH --mail-type=END,FAIL,TIME_LIMIT --mail-user=martin.khannouz@inria.fr +## modules to load for the job +module purge +module load slurm +module add compiler/gcc/5.3.0 tools/module_cat/1.0.0 intel/mkl/64/11.2/2016.0.0 +. 
/home/mkhannou/spack/share/spack/setup-env.sh +spack load fftw +spack load hwloc +spack load openmpi +spack load starpu@svn-trunk+fxt +## variable for the job +export GROUP_SIZE=500 +export TREE_HEIGHT=8 +export NB_NODE=$SLURM_JOB_NUM_NODES +export STARPU_NCPU=24 +export NB_PARTICLE_PER_NODE=5000000 +export STARPU_FXT_PREFIX=$SLURM_JOB_ID +export FINAL_DIR="`pwd`/dir_$SLURM_JOB_ID" +mkdir $FINAL_DIR +echo "my jobID: " $SLURM_JOB_ID > $FINAL_DIR/stdout +echo "Model: cube" >> $FINAL_DIR/stdout +echo "Nb node: " $NB_NODE >> $FINAL_DIR/stdout +echo "Nb thread: " $STARPU_NCPU >> $FINAL_DIR/stdout +echo "Tree height: " $TREE_HEIGHT >> $FINAL_DIR/stdout +echo "Group size: " $GROUP_SIZE >> $FINAL_DIR/stdout +echo "Algorithm: explicit" >> $FINAL_DIR/stdout +echo "Particle per node: " $NB_PARTICLE_PER_NODE >> $FINAL_DIR/stdout +echo "Total particles: " $(($NB_PARTICLE_PER_NODE*$NB_NODE)) >> $FINAL_DIR/stdout +mpiexec -n $NB_NODE ./Build/Tests/Release/testBlockedMpiChebyshev -nb $NB_PARTICLE_PER_NODE -bs $GROUP_SIZE -h $TREE_HEIGHT -no-validation | grep Average >> $FINAL_DIR/stdout + +#Create argument list for starpu_fxt_tool +cd $FINAL_DIR +list_fxt_file=`ls ../$STARPU_FXT_PREFIX*` + +#Clean to only keep trace.rec +mkdir fxt +for i in $list_fxt_file; do + mv $i fxt +done +cd .. + +##Move the result into a directory where all result goes +mv $FINAL_DIR jobs_result diff --git a/jobs/explicit_1N_chebyshev.sh b/jobs/explicit_1N_chebyshev.sh new file mode 100644 index 0000000000000000000000000000000000000000..6d2e0f94ee7db5f4396a14c1896e9e6f52af3338 --- /dev/null +++ b/jobs/explicit_1N_chebyshev.sh @@ -0,0 +1,53 @@ +#!/usr/bin/env bash +## name of job +#SBATCH -J explicit_50M_1N +#SBATCH -p defq +## Resources: (nodes, procs, tasks, walltime, ... etc) +#SBATCH -N 1 +#SBATCH -c 24 +#SBATCH --time=02:00:00 +# # output error message +#SBATCH -e explicit_chebyshev_50M_10_node%j.err +#SBATCH --mail-type=END,FAIL,TIME_LIMIT --mail-user=martin.khannouz@inria.fr +## modules to load for the job +module purge +module load slurm +module add compiler/gcc/5.3.0 tools/module_cat/1.0.0 intel/mkl/64/11.2/2016.0.0 +. /home/mkhannou/spack/share/spack/setup-env.sh +spack load fftw +spack load hwloc +spack load openmpi +spack load starpu@svn-trunk+fxt +## variable for the job +export GROUP_SIZE=500 +export TREE_HEIGHT=8 +export NB_NODE=$SLURM_JOB_NUM_NODES +export STARPU_NCPU=24 +export NB_PARTICLE_PER_NODE=50000000 +export STARPU_FXT_PREFIX=$SLURM_JOB_ID +export FINAL_DIR="`pwd`/dir_$SLURM_JOB_ID" +mkdir $FINAL_DIR +echo "my jobID: " $SLURM_JOB_ID > $FINAL_DIR/stdout +echo "Model: cube" >> $FINAL_DIR/stdout +echo "Nb node: " $NB_NODE >> $FINAL_DIR/stdout +echo "Nb thread: " $STARPU_NCPU >> $FINAL_DIR/stdout +echo "Tree height: " $TREE_HEIGHT >> $FINAL_DIR/stdout +echo "Group size: " $GROUP_SIZE >> $FINAL_DIR/stdout +echo "Algorithm: explicit" >> $FINAL_DIR/stdout +echo "Particle per node: " $NB_PARTICLE_PER_NODE >> $FINAL_DIR/stdout +echo "Total particles: " $(($NB_PARTICLE_PER_NODE*$NB_NODE)) >> $FINAL_DIR/stdout +mpiexec -n $NB_NODE ./Build/Tests/Release/testBlockedMpiChebyshev -nb $NB_PARTICLE_PER_NODE -bs $GROUP_SIZE -h $TREE_HEIGHT -no-validation | grep Average >> $FINAL_DIR/stdout + +#Create argument list for starpu_fxt_tool +cd $FINAL_DIR +list_fxt_file=`ls ../$STARPU_FXT_PREFIX*` + +#Clean to only keep trace.rec +mkdir fxt +for i in $list_fxt_file; do + mv $i fxt +done +cd .. 
+ +##Move the result into a directory where all result goes +mv $FINAL_DIR jobs_result diff --git a/jobs/explicit_2N_chebyshev.sh b/jobs/explicit_2N_chebyshev.sh new file mode 100644 index 0000000000000000000000000000000000000000..93b26ae788c1d13f011f78975f94df39591ed6e5 --- /dev/null +++ b/jobs/explicit_2N_chebyshev.sh @@ -0,0 +1,54 @@ +#!/usr/bin/env bash +## name of job +#SBATCH -J explicit_50M_2N +#SBATCH -p court +## Resources: (nodes, procs, tasks, walltime, ... etc) +#SBATCH -N 2 +#SBATCH -c 24 +#SBATCH --time=04:00:00 +# # output error message +#SBATCH -e explicit_50M_2N_%j.err +#SBATCH --mail-type=END,FAIL,TIME_LIMIT --mail-user=martin.khannouz@inria.fr +## modules to load for the job +module purge +module load slurm +module add compiler/gcc/5.3.0 tools/module_cat/1.0.0 intel/mkl/64/11.2/2016.0.0 +. /home/mkhannou/spack/share/spack/setup-env.sh +spack load fftw +spack load hwloc +spack load openmpi +spack load starpu@svn-trunk+fxt +## variable for the job +export GROUP_SIZE=500 +export TREE_HEIGHT=8 +export NB_NODE=$SLURM_JOB_NUM_NODES +export STARPU_NCPU=24 +export NB_PARTICLE_PER_NODE=25000000 +export STARPU_FXT_PREFIX=$SLURM_JOB_ID +export FINAL_DIR="`pwd`/dir_$SLURM_JOB_ID" +mkdir $FINAL_DIR +echo "my jobID: " $SLURM_JOB_ID > $FINAL_DIR/stdout +echo "Model: cube" >> $FINAL_DIR/stdout +echo "Nb node: " $NB_NODE >> $FINAL_DIR/stdout +echo "Nb thread: " $STARPU_NCPU >> $FINAL_DIR/stdout +echo "Tree height: " $TREE_HEIGHT >> $FINAL_DIR/stdout +echo "Group size: " $GROUP_SIZE >> $FINAL_DIR/stdout +echo "Algorithm: explicit" >> $FINAL_DIR/stdout +echo "Particle per node: " $NB_PARTICLE_PER_NODE >> $FINAL_DIR/stdout +echo "Total particles: " $(($NB_PARTICLE_PER_NODE*$NB_NODE)) >> $FINAL_DIR/stdout +mpiexec -n $NB_NODE ./Build/Tests/Release/testBlockedMpiChebyshev -nb $NB_PARTICLE_PER_NODE -bs $GROUP_SIZE -h $TREE_HEIGHT -no-validation | grep Average >> $FINAL_DIR/stdout + +#Create argument list for starpu_fxt_tool +cd $FINAL_DIR +list_fxt_file=`ls ../$STARPU_FXT_PREFIX*` + +#Clean to only keep trace.rec +mkdir fxt +for i in $list_fxt_file; do + mv $i fxt +done +cd .. + +##Move the result into a directory where all result goes +mv $FINAL_DIR jobs_result + diff --git a/jobs/explicit_4N_chebyshev.sh b/jobs/explicit_4N_chebyshev.sh new file mode 100644 index 0000000000000000000000000000000000000000..d911cd7947a5ceecceb3e8424247b8f949e1d105 --- /dev/null +++ b/jobs/explicit_4N_chebyshev.sh @@ -0,0 +1,53 @@ +#!/usr/bin/env bash +## name of job +#SBATCH -J explicit_50M_4N +#SBATCH -p court +## Resources: (nodes, procs, tasks, walltime, ... etc) +#SBATCH -N 4 +#SBATCH -c 24 +#SBATCH --time=04:00:00 +# # output error message +#SBATCH -e explicit_chebyshev_50M_10_node%j.err +#SBATCH --mail-type=END,FAIL,TIME_LIMIT --mail-user=martin.khannouz@inria.fr +## modules to load for the job +module purge +module load slurm +module add compiler/gcc/5.3.0 tools/module_cat/1.0.0 intel/mkl/64/11.2/2016.0.0 +. 
/home/mkhannou/spack/share/spack/setup-env.sh +spack load fftw +spack load hwloc +spack load openmpi +spack load starpu@svn-trunk+fxt +## variable for the job +export GROUP_SIZE=500 +export TREE_HEIGHT=8 +export NB_NODE=$SLURM_JOB_NUM_NODES +export STARPU_NCPU=24 +export NB_PARTICLE_PER_NODE=12500000 +export STARPU_FXT_PREFIX=$SLURM_JOB_ID +export FINAL_DIR="`pwd`/dir_$SLURM_JOB_ID" +mkdir $FINAL_DIR +echo "my jobID: " $SLURM_JOB_ID > $FINAL_DIR/stdout +echo "Model: cube" >> $FINAL_DIR/stdout +echo "Nb node: " $NB_NODE >> $FINAL_DIR/stdout +echo "Nb thread: " $STARPU_NCPU >> $FINAL_DIR/stdout +echo "Tree height: " $TREE_HEIGHT >> $FINAL_DIR/stdout +echo "Group size: " $GROUP_SIZE >> $FINAL_DIR/stdout +echo "Algorithm: explicit" >> $FINAL_DIR/stdout +echo "Particle per node: " $NB_PARTICLE_PER_NODE >> $FINAL_DIR/stdout +echo "Total particles: " $(($NB_PARTICLE_PER_NODE*$NB_NODE)) >> $FINAL_DIR/stdout +mpiexec -n $NB_NODE ./Build/Tests/Release/testBlockedMpiChebyshev -nb $NB_PARTICLE_PER_NODE -bs $GROUP_SIZE -h $TREE_HEIGHT -no-validation | grep Average >> $FINAL_DIR/stdout + +#Create argument list for starpu_fxt_tool +cd $FINAL_DIR +list_fxt_file=`ls ../$STARPU_FXT_PREFIX*` + +#Clean to only keep trace.rec +mkdir fxt +for i in $list_fxt_file; do + mv $i fxt +done +cd .. + +##Move the result into a directory where all result goes +mv $FINAL_DIR jobs_result diff --git a/jobs/explicit_8N_chebyshev.sh b/jobs/explicit_8N_chebyshev.sh new file mode 100644 index 0000000000000000000000000000000000000000..033cfd5520265212db3fcb9dace6b045429a9c7b --- /dev/null +++ b/jobs/explicit_8N_chebyshev.sh @@ -0,0 +1,54 @@ +#!/usr/bin/env bash +## name of job +#SBATCH -J explicit_50M_8N +#SBATCH -p special +## Resources: (nodes, procs, tasks, walltime, ... etc) +#SBATCH -N 8 +#SBATCH -c 24 +#SBATCH --time=00:30:00 +# # output error message +#SBATCH -e explicit_50M_8N%j.err +#SBATCH --mail-type=END,FAIL,TIME_LIMIT --mail-user=martin.khannouz@inria.fr +## modules to load for the job +module purge +module load slurm +module add compiler/gcc/5.3.0 tools/module_cat/1.0.0 intel/mkl/64/11.2/2016.0.0 +. /home/mkhannou/spack/share/spack/setup-env.sh +spack load fftw +spack load hwloc +spack load openmpi +spack load starpu@svn-trunk+fxt +## variable for the job +export GROUP_SIZE=500 +export TREE_HEIGHT=8 +export NB_NODE=$SLURM_JOB_NUM_NODES +export STARPU_NCPU=24 +export NB_PARTICLE_PER_NODE=6250000 +export STARPU_FXT_PREFIX=$SLURM_JOB_ID +export FINAL_DIR="`pwd`/dir_$SLURM_JOB_ID" +mkdir $FINAL_DIR +echo "my jobID: " $SLURM_JOB_ID > $FINAL_DIR/stdout +echo "Model: cube" >> $FINAL_DIR/stdout +echo "Nb node: " $NB_NODE >> $FINAL_DIR/stdout +echo "Nb thread: " $STARPU_NCPU >> $FINAL_DIR/stdout +echo "Tree height: " $TREE_HEIGHT >> $FINAL_DIR/stdout +echo "Group size: " $GROUP_SIZE >> $FINAL_DIR/stdout +echo "Algorithm: explicit" >> $FINAL_DIR/stdout +echo "Particle per node: " $NB_PARTICLE_PER_NODE >> $FINAL_DIR/stdout +echo "Total particles: " $(($NB_PARTICLE_PER_NODE*$NB_NODE)) >> $FINAL_DIR/stdout +mpiexec -n $NB_NODE ./Build/Tests/Release/testBlockedMpiChebyshev -nb $NB_PARTICLE_PER_NODE -bs $GROUP_SIZE -h $TREE_HEIGHT -no-validation | grep Average >> $FINAL_DIR/stdout + +#Create argument list for starpu_fxt_tool +cd $FINAL_DIR +list_fxt_file=`ls ../$STARPU_FXT_PREFIX*` + +#Clean to only keep trace.rec +mkdir fxt +for i in $list_fxt_file; do + mv $i fxt +done +cd .. 
+ +##Move the result into a directory where all result goes +mv $FINAL_DIR jobs_result + diff --git a/jobs/explicit_mpi_chebyshev.sh b/jobs/explicit_mpi_chebyshev.sh deleted file mode 100644 index 04997a7db44343ff69634a1e3984cfa9866c44ed..0000000000000000000000000000000000000000 --- a/jobs/explicit_mpi_chebyshev.sh +++ /dev/null @@ -1,40 +0,0 @@ -#!/usr/bin/env bash -## name of job -#SBATCH -J explicit_chebyshev_50M_10_node -#SBATCH -p longq -## Resources: (nodes, procs, tasks, walltime, ... etc) -#SBATCH -N 10 -#SBATCH -c 24 -# # standard output message -#SBATCH -o explicit_chebyshev_50M_10_node%j.out -# # output error message -#SBATCH -e explicit_chebyshev_50M_10_node%j.err -#SBATCH --mail-type=ALL --mail-user=martin.khannouz@inria.fr -## modules to load for the job -module purge -module load slurm -module add compiler/gcc/5.3.0 tools/module_cat/1.0.0 intel/mkl/64/11.2/2016.0.0 -. /home/mkhannou/spack/share/spack/setup-env.sh -spack load fftw -spack load hwloc -spack load openmpi -spack load starpu@svn-trunk+fxt -## variable for the job -export GROUP_SIZE=500 -export TREE_HEIGHT=8 -export NB_NODE=$SLURM_JOB_NUM_NODES -export STARPU_NCPU=24 -export NB_PARTICLE_PER_NODE=5000000 -export STARPU_FXT_PREFIX=`pwd`/ -echo "===== Explicit MPI ====" -echo "my jobID: " $SLURM_JOB_ID -echo "Model: cube" -echo "Nb node: " $NB_NODE -echo "Nb thread: " $STARPU_NCPU -echo "Tree height: " $TREE_HEIGHT -echo "Group size: " $GROUP_SIZE -echo "Algorithm: explicit" -echo "Particle per node: " $NB_PARTICLE_PER_NODE -echo "Total particles: " $(($NB_PARTICLE_PER_NODE*$NB_NODE)) -mpiexec -n $NB_NODE ./Build/Tests/Release/testBlockedMpiChebyshev -nb $NB_PARTICLE_PER_NODE -bs $GROUP_SIZE -h $TREE_HEIGHT -no-validation | grep Average - diff --git a/jobs/implicit_10N_chebyshev.sh b/jobs/implicit_10N_chebyshev.sh new file mode 100644 index 0000000000000000000000000000000000000000..5c68159a1469ba222e728877f650df154e4bf55e --- /dev/null +++ b/jobs/implicit_10N_chebyshev.sh @@ -0,0 +1,53 @@ +#!/usr/bin/env bash +## name of job +#SBATCH -J implicit_50M_10N +#SBATCH -p special +## Resources: (nodes, procs, tasks, walltime, ... etc) +#SBATCH -N 10 +#SBATCH -c 24 +#SBATCH --time=00:30:00 +# # output error message +#SBATCH -e implicit_chebyshev_50M_10_node%j.err +#SBATCH --mail-type=END,FAIL,TIME_LIMIT --mail-user=martin.khannouz@inria.fr +## modules to load for the job +module purge +module load slurm +module add compiler/gcc/5.3.0 tools/module_cat/1.0.0 intel/mkl/64/11.2/2016.0.0 +. 
/home/mkhannou/spack/share/spack/setup-env.sh +spack load fftw +spack load hwloc +spack load openmpi +spack load starpu@svn-trunk+fxt +## variable for the job +export GROUP_SIZE=500 +export TREE_HEIGHT=8 +export NB_NODE=$SLURM_JOB_NUM_NODES +export STARPU_NCPU=24 +export NB_PARTICLE_PER_NODE=5000000 +export STARPU_FXT_PREFIX=$SLURM_JOB_ID +export FINAL_DIR="`pwd`/dir_$SLURM_JOB_ID" +mkdir $FINAL_DIR +echo "my jobID: " $SLURM_JOB_ID > $FINAL_DIR/stdout +echo "Model: cube" >> $FINAL_DIR/stdout +echo "Nb node: " $NB_NODE >> $FINAL_DIR/stdout +echo "Nb thread: " $STARPU_NCPU >> $FINAL_DIR/stdout +echo "Tree height: " $TREE_HEIGHT >> $FINAL_DIR/stdout +echo "Group size: " $GROUP_SIZE >> $FINAL_DIR/stdout +echo "Algorithm: implicit" >> $FINAL_DIR/stdout +echo "Particle per node: " $NB_PARTICLE_PER_NODE >> $FINAL_DIR/stdout +echo "Total particles: " $(($NB_PARTICLE_PER_NODE*$NB_NODE)) >> $FINAL_DIR/stdout +mpiexec -n $NB_NODE ./Build/Tests/Release/testBlockedImplicitChebyshev -nb $NB_PARTICLE_PER_NODE -bs $GROUP_SIZE -h $TREE_HEIGHT -no-validation | grep Average >> $FINAL_DIR/stdout + +#Create argument list for starpu_fxt_tool +cd $FINAL_DIR +list_fxt_file=`ls ../$STARPU_FXT_PREFIX*` + +#Clean to only keep trace.rec +mkdir fxt +for i in $list_fxt_file; do + mv $i fxt +done +cd .. + +##Move the result into a directory where all result goes +mv $FINAL_DIR jobs_result diff --git a/jobs/implicit_1N_chebyshev.sh b/jobs/implicit_1N_chebyshev.sh new file mode 100644 index 0000000000000000000000000000000000000000..67ba8d52db1f99b56c46c3589893690cba52ff37 --- /dev/null +++ b/jobs/implicit_1N_chebyshev.sh @@ -0,0 +1,53 @@ +#!/usr/bin/env bash +## name of job +#SBATCH -J implicit_50M_1N +#SBATCH -p defq +## Resources: (nodes, procs, tasks, walltime, ... etc) +#SBATCH -N 1 +#SBATCH -c 24 +#SBATCH --time=02:00:00 +# # output error message +#SBATCH -e implicit_chebyshev_50M_10_node%j.err +#SBATCH --mail-type=END,FAIL,TIME_LIMIT --mail-user=martin.khannouz@inria.fr +## modules to load for the job +module purge +module load slurm +module add compiler/gcc/5.3.0 tools/module_cat/1.0.0 intel/mkl/64/11.2/2016.0.0 +. /home/mkhannou/spack/share/spack/setup-env.sh +spack load fftw +spack load hwloc +spack load openmpi +spack load starpu@svn-trunk+fxt +## variable for the job +export GROUP_SIZE=500 +export TREE_HEIGHT=8 +export NB_NODE=$SLURM_JOB_NUM_NODES +export STARPU_NCPU=24 +export NB_PARTICLE_PER_NODE=50000000 +export STARPU_FXT_PREFIX=$SLURM_JOB_ID +export FINAL_DIR="`pwd`/dir_$SLURM_JOB_ID" +mkdir $FINAL_DIR +echo "my jobID: " $SLURM_JOB_ID > $FINAL_DIR/stdout +echo "Model: cube" >> $FINAL_DIR/stdout +echo "Nb node: " $NB_NODE >> $FINAL_DIR/stdout +echo "Nb thread: " $STARPU_NCPU >> $FINAL_DIR/stdout +echo "Tree height: " $TREE_HEIGHT >> $FINAL_DIR/stdout +echo "Group size: " $GROUP_SIZE >> $FINAL_DIR/stdout +echo "Algorithm: implicit" >> $FINAL_DIR/stdout +echo "Particle per node: " $NB_PARTICLE_PER_NODE >> $FINAL_DIR/stdout +echo "Total particles: " $(($NB_PARTICLE_PER_NODE*$NB_NODE)) >> $FINAL_DIR/stdout +mpiexec -n $NB_NODE ./Build/Tests/Release/testBlockedImplicitChebyshev -nb $NB_PARTICLE_PER_NODE -bs $GROUP_SIZE -h $TREE_HEIGHT -no-validation | grep Average >> $FINAL_DIR/stdout + +#Create argument list for starpu_fxt_tool +cd $FINAL_DIR +list_fxt_file=`ls ../$STARPU_FXT_PREFIX*` + +#Clean to only keep trace.rec +mkdir fxt +for i in $list_fxt_file; do + mv $i fxt +done +cd .. 
+ +##Move the result into a directory where all result goes +mv $FINAL_DIR jobs_result diff --git a/jobs/implicit_2N_chebyshev.sh b/jobs/implicit_2N_chebyshev.sh new file mode 100644 index 0000000000000000000000000000000000000000..c119a54ce080203826543cd6b8151aaa71e9ea96 --- /dev/null +++ b/jobs/implicit_2N_chebyshev.sh @@ -0,0 +1,54 @@ +#!/usr/bin/env bash +## name of job +#SBATCH -J implicit_50M_2N +#SBATCH -p special +## Resources: (nodes, procs, tasks, walltime, ... etc) +#SBATCH -N 2 +#SBATCH -c 24 +#SBATCH --time=00:30:00 +# # output error message +#SBATCH -e implicit_50M_2N_%j.err +#SBATCH --mail-type=END,FAIL,TIME_LIMIT --mail-user=martin.khannouz@inria.fr +## modules to load for the job +module purge +module load slurm +module add compiler/gcc/5.3.0 tools/module_cat/1.0.0 intel/mkl/64/11.2/2016.0.0 +. /home/mkhannou/spack/share/spack/setup-env.sh +spack load fftw +spack load hwloc +spack load openmpi +spack load starpu@svn-trunk+fxt +## variable for the job +export GROUP_SIZE=500 +export TREE_HEIGHT=8 +export NB_NODE=$SLURM_JOB_NUM_NODES +export STARPU_NCPU=24 +export NB_PARTICLE_PER_NODE=25000000 +export STARPU_FXT_PREFIX=$SLURM_JOB_ID +export FINAL_DIR="`pwd`/dir_$SLURM_JOB_ID" +mkdir $FINAL_DIR +echo "my jobID: " $SLURM_JOB_ID > $FINAL_DIR/stdout +echo "Model: cube" >> $FINAL_DIR/stdout +echo "Nb node: " $NB_NODE >> $FINAL_DIR/stdout +echo "Nb thread: " $STARPU_NCPU >> $FINAL_DIR/stdout +echo "Tree height: " $TREE_HEIGHT >> $FINAL_DIR/stdout +echo "Group size: " $GROUP_SIZE >> $FINAL_DIR/stdout +echo "Algorithm: implicit" >> $FINAL_DIR/stdout +echo "Particle per node: " $NB_PARTICLE_PER_NODE >> $FINAL_DIR/stdout +echo "Total particles: " $(($NB_PARTICLE_PER_NODE*$NB_NODE)) >> $FINAL_DIR/stdout +mpiexec -n $NB_NODE ./Build/Tests/Release/testBlockedImplicitChebyshev -nb $NB_PARTICLE_PER_NODE -bs $GROUP_SIZE -h $TREE_HEIGHT -no-validation | grep Average >> $FINAL_DIR/stdout + +#Create argument list for starpu_fxt_tool +cd $FINAL_DIR +list_fxt_file=`ls ../$STARPU_FXT_PREFIX*` + +#Clean to only keep trace.rec +mkdir fxt +for i in $list_fxt_file; do + mv $i fxt +done +cd .. + +##Move the result into a directory where all result goes +mv $FINAL_DIR jobs_result + diff --git a/jobs/implicit_4N_chebyshev.sh b/jobs/implicit_4N_chebyshev.sh new file mode 100644 index 0000000000000000000000000000000000000000..bc4cbb6433bfa0b47702ede89fbb402a3281a98e --- /dev/null +++ b/jobs/implicit_4N_chebyshev.sh @@ -0,0 +1,53 @@ +#!/usr/bin/env bash +## name of job +#SBATCH -J implicit_50M_4N +#SBATCH -p special +## Resources: (nodes, procs, tasks, walltime, ... etc) +#SBATCH -N 4 +#SBATCH -c 24 +#SBATCH --time=00:30:00 +# # output error message +#SBATCH -e implicit_chebyshev_50M_10_node%j.err +#SBATCH --mail-type=END,FAIL,TIME_LIMIT --mail-user=martin.khannouz@inria.fr +## modules to load for the job +module purge +module load slurm +module add compiler/gcc/5.3.0 tools/module_cat/1.0.0 intel/mkl/64/11.2/2016.0.0 +. 
/home/mkhannou/spack/share/spack/setup-env.sh +spack load fftw +spack load hwloc +spack load openmpi +spack load starpu@svn-trunk+fxt +## variable for the job +export GROUP_SIZE=500 +export TREE_HEIGHT=8 +export NB_NODE=$SLURM_JOB_NUM_NODES +export STARPU_NCPU=24 +export NB_PARTICLE_PER_NODE=12500000 +export STARPU_FXT_PREFIX=$SLURM_JOB_ID +export FINAL_DIR="`pwd`/dir_$SLURM_JOB_ID" +mkdir $FINAL_DIR +echo "my jobID: " $SLURM_JOB_ID > $FINAL_DIR/stdout +echo "Model: cube" >> $FINAL_DIR/stdout +echo "Nb node: " $NB_NODE >> $FINAL_DIR/stdout +echo "Nb thread: " $STARPU_NCPU >> $FINAL_DIR/stdout +echo "Tree height: " $TREE_HEIGHT >> $FINAL_DIR/stdout +echo "Group size: " $GROUP_SIZE >> $FINAL_DIR/stdout +echo "Algorithm: implicit" >> $FINAL_DIR/stdout +echo "Particle per node: " $NB_PARTICLE_PER_NODE >> $FINAL_DIR/stdout +echo "Total particles: " $(($NB_PARTICLE_PER_NODE*$NB_NODE)) >> $FINAL_DIR/stdout +mpiexec -n $NB_NODE ./Build/Tests/Release/testBlockedImplicitChebyshev -nb $NB_PARTICLE_PER_NODE -bs $GROUP_SIZE -h $TREE_HEIGHT -no-validation | grep Average >> $FINAL_DIR/stdout + +#Create argument list for starpu_fxt_tool +cd $FINAL_DIR +list_fxt_file=`ls ../$STARPU_FXT_PREFIX*` + +#Clean to only keep trace.rec +mkdir fxt +for i in $list_fxt_file; do + mv $i fxt +done +cd .. + +##Move the result into a directory where all result goes +mv $FINAL_DIR jobs_result diff --git a/jobs/implicit_8N_chebyshev.sh b/jobs/implicit_8N_chebyshev.sh new file mode 100644 index 0000000000000000000000000000000000000000..359891f3a24d7760c0b03eedc98620c20d428605 --- /dev/null +++ b/jobs/implicit_8N_chebyshev.sh @@ -0,0 +1,54 @@ +#!/usr/bin/env bash +## name of job +#SBATCH -J implicit_50M_8N +#SBATCH -p special +## Resources: (nodes, procs, tasks, walltime, ... etc) +#SBATCH -N 8 +#SBATCH -c 24 +#SBATCH --time=00:30:00 +# # output error message +#SBATCH -e implicit_50M_8N_%j.err +#SBATCH --mail-type=END,FAIL,TIME_LIMIT --mail-user=martin.khannouz@inria.fr +## modules to load for the job +module purge +module load slurm +module add compiler/gcc/5.3.0 tools/module_cat/1.0.0 intel/mkl/64/11.2/2016.0.0 +. /home/mkhannou/spack/share/spack/setup-env.sh +spack load fftw +spack load hwloc +spack load openmpi +spack load starpu@svn-trunk+fxt +## variable for the job +export GROUP_SIZE=500 +export TREE_HEIGHT=8 +export NB_NODE=$SLURM_JOB_NUM_NODES +export STARPU_NCPU=24 +export NB_PARTICLE_PER_NODE=6250000 +export STARPU_FXT_PREFIX=$SLURM_JOB_ID +export FINAL_DIR="`pwd`/dir_$SLURM_JOB_ID" +mkdir $FINAL_DIR +echo "my jobID: " $SLURM_JOB_ID > $FINAL_DIR/stdout +echo "Model: cube" >> $FINAL_DIR/stdout +echo "Nb node: " $NB_NODE >> $FINAL_DIR/stdout +echo "Nb thread: " $STARPU_NCPU >> $FINAL_DIR/stdout +echo "Tree height: " $TREE_HEIGHT >> $FINAL_DIR/stdout +echo "Group size: " $GROUP_SIZE >> $FINAL_DIR/stdout +echo "Algorithm: implicit" >> $FINAL_DIR/stdout +echo "Particle per node: " $NB_PARTICLE_PER_NODE >> $FINAL_DIR/stdout +echo "Total particles: " $(($NB_PARTICLE_PER_NODE*$NB_NODE)) >> $FINAL_DIR/stdout +mpiexec -n $NB_NODE ./Build/Tests/Release/testBlockedImplicitChebyshev -nb $NB_PARTICLE_PER_NODE -bs $GROUP_SIZE -h $TREE_HEIGHT -no-validation | grep Average >> $FINAL_DIR/stdout + +#Create argument list for starpu_fxt_tool +cd $FINAL_DIR +list_fxt_file=`ls ../$STARPU_FXT_PREFIX*` + +#Clean to only keep trace.rec +mkdir fxt +for i in $list_fxt_file; do + mv $i fxt +done +cd .. 
+ +##Move the result into a directory where all result goes +mv $FINAL_DIR jobs_result + diff --git a/jobs/implicit_chebyshev.sh b/jobs/implicit_chebyshev.sh deleted file mode 100644 index f00fb735923461a8cbf6bcfe89f0d66e9b0ce2f4..0000000000000000000000000000000000000000 --- a/jobs/implicit_chebyshev.sh +++ /dev/null @@ -1,41 +0,0 @@ -#!/usr/bin/env bash -## name of job -#SBATCH -J implicit_chebyshev_50M_10_node -#SBATCH -p longq -## Resources: (nodes, procs, tasks, walltime, ... etc) -#SBATCH -N 10 -#SBATCH -c 24 -# # standard output message -#SBATCH -o implicit_chebyshev_50M_10_node%j.out -# # output error message -#SBATCH -e implicit_chebyshev_50M_10_node%j.err -#SBATCH --mail-type=ALL --mail-user=martin.khannouz@inria.fr -## modules to load for the job -module purge -module load slurm -module add compiler/gcc/5.3.0 tools/module_cat/1.0.0 intel/mkl/64/11.2/2016.0.0 -. /home/mkhannou/spack/share/spack/setup-env.sh -spack load fftw -spack load hwloc -spack load openmpi -spack load starpu@svn-trunk+fxt -## variable for the job -export GROUP_SIZE=500 -export TREE_HEIGHT=8 -export NB_NODE=$SLURM_JOB_NUM_NODES -export STARPU_NCPU=24 -export NB_PARTICLE_PER_NODE=5000000 -export STARPU_FXT_PREFIX=`pwd`/ -echo "===== Implicit MPI ====" -echo "my jobID: " $SLURM_JOB_ID -echo "Model: cube" -echo "Nb node: " $NB_NODE -echo "Nb thread: " $STARPU_NCPU -echo "Tree height: " $TREE_HEIGHT -echo "Group size: " $GROUP_SIZE -echo "Algorithm: implicit" -echo "Particle per node: " $NB_PARTICLE_PER_NODE -echo "Total particles: " $(($NB_PARTICLE_PER_NODE*$NB_NODE)) -mpiexec -n $NB_NODE ./Build/Tests/Release/testBlockedImplicitChebyshev -nb $NB_PARTICLE_PER_NODE -bs $GROUP_SIZE -h $TREE_HEIGHT -no-validation | grep Average - - diff --git a/jobs/starpu_chebyshev.sh b/jobs/starpu_chebyshev.sh index 2517bed48d09d00f164db5f7b3e8396843a08a2e..51c7c2f504cb4c4e3b4169b8e20cbb448a4f89ca 100644 --- a/jobs/starpu_chebyshev.sh +++ b/jobs/starpu_chebyshev.sh @@ -1,15 +1,16 @@ #!/usr/bin/env bash ## name of job -#SBATCH -J chebyshev_50M_1_node -#SBATCH -p longq +#SBATCH -J starpu_50M +## Queue where the job is executed +#SBATCH -p defq ## Resources: (nodes, procs, tasks, walltime, ... 
etc) #SBATCH -N 1 #SBATCH -c 24 -# # standard output message -#SBATCH -o chebyshev_50M_1_node%j.out +#SBATCH --time=02:00:00 # # output error message -#SBATCH -e chebyshev_50M_1_node%j.err -#SBATCH --mail-type=ALL --mail-user=martin.khannouz@inria.fr +#SBATCH -e starpu_50M_%j.err +#SBATCH --mail-type=END,FAIL,TIME_LIMIT --mail-user=martin.khannouz@inria.fr +## modules to load for the job module purge module load slurm module add compiler/gcc/5.3.0 tools/module_cat/1.0.0 intel/mkl/64/11.2/2016.0.0 @@ -18,21 +19,38 @@ spack load fftw spack load hwloc spack load openmpi spack load starpu@svn-trunk+fxt -## modules to load for the job +##Setting variable for the job export GROUP_SIZE=500 export TREE_HEIGHT=8 export NB_NODE=$SLURM_JOB_NUM_NODES export STARPU_NCPU=24 export NB_PARTICLE_PER_NODE=50000000 -export STARPU_FXT_PREFIX=`pwd`/ -echo "===== StarPU only =====" -echo "my jobID: " $SLURM_JOB_ID -echo "Model: cube" -echo "Nb node: " $NB_NODE -echo "Nb thread: " $STARPU_NCPU -echo "Tree height: " $TREE_HEIGHT -echo "Group size: " $GROUP_SIZE -echo "Algorithm: starpu" -echo "Particle per node: " $NB_PARTICLE_PER_NODE -echo "Total particles: " $(($NB_PARTICLE_PER_NODE*$NB_NODE)) -mpiexec -n $NB_NODE ./Build/Tests/Release/testBlockedChebyshev -nb $NB_PARTICLE_PER_NODE -bs $GROUP_SIZE -h $TREE_HEIGHT -no-validation | grep Kernel +export STARPU_FXT_PREFIX=$SLURM_JOB_ID +export FINAL_DIR="`pwd`/dir_$SLURM_JOB_ID" +mkdir $FINAL_DIR + +## Write data into an stdout file +echo "my jobID: " $SLURM_JOB_ID > $FINAL_DIR/stdout +echo "Model: cube" >> $FINAL_DIR/stdout +echo "Nb node: " $NB_NODE >> $FINAL_DIR/stdout +echo "Nb thread: " $STARPU_NCPU >> $FINAL_DIR/stdout +echo "Tree height: " $TREE_HEIGHT >> $FINAL_DIR/stdout +echo "Group size: " $GROUP_SIZE >> $FINAL_DIR/stdout +echo "Algorithm: starpu" >> $FINAL_DIR/stdout +echo "Particle per node: " $NB_PARTICLE_PER_NODE >> $FINAL_DIR/stdout +echo "Total particles: " $(($NB_PARTICLE_PER_NODE*$NB_NODE)) >> $FINAL_DIR/stdout +./Build/Tests/Release/testBlockedChebyshev -nb $NB_PARTICLE_PER_NODE -bs $GROUP_SIZE -h $TREE_HEIGHT -no-validation >> $FINAL_DIR/stdout + +##Create argument list for starpu_fxt_tool +cd $FINAL_DIR +list_fxt_file=`ls ../$STARPU_FXT_PREFIX*` + +#Clean to only keep trace.rec +mkdir fxt +for i in $list_fxt_file; do + mv $i fxt +done +cd .. + +##Move the result into a directory where all result goes +mv $FINAL_DIR jobs_result