Configuring libKOMP to enable the tracing tools
The libKOMP tracing tool is based on OMPT support and generates KAAPI events into trace files. The events captured and recorded are controlled by environment variables.
Assuming you have downloaded the library, the following lines create a new build directory and configure the library with the tracing tools plus affinity support based on the T.H.E protocol with aggregation.
> mkdir build_release_trace
> cd build_release_trace
> cmake ../openmp-kaapi \
-DLIBOMP_OMPT_SUPPORT=on -DLIBOMP_KAAPI_TRACING=on \
-DLIBOMP_USE_HWLOC=true \
-DCMAKE_INSTALL_PREFIX=<your prefix> \
-DCMAKE_BUILD_TYPE=release
You should also consider the capability to attach hardware performance counters to tasks, parallel regions, or loops. To do that, configure the library with PAPI support:
> cmake ../openmp-kaapi \
-DLIBOMP_OMPT_SUPPORT=on -DLIBOMP_KAAPI_TRACING=on \
-DLIBOMP_USE_HWLOC=true \
-DCMAKE_INSTALL_PREFIX=<your prefix> \
-DCMAKE_BUILD_TYPE=release \
-DLIBOMP_USE_PAPI=true
Running your OpenMP code
Once your OpenMP code has been downloaded and built, you can run it with environment variables that select which events you want to capture.
To run the binary, preload the library trace-libomp.so, which is built during compilation when the tracing tool is configured.
The following lines run the KASTORS Cholesky factorization on 96 cores, capturing events for the work and time counters plus all events related to computation and OMP-specific features.
> N=96; PREFIX=<your prefix>/lib; OMP_NUM_THREADS=$N \
OMP_PLACES="cores($N)" \
LD_LIBRARY_PATH=$PREFIX \
OMP_TOOL=enabled \
KAAPI_RECORD_TRACE=1 \
KAAPI_RECORD_MASK=compute,omp \
KAAPI_TASKPERF_EVENTS=work \
LD_PRELOAD=$PREFIX/trace-libomp.so \
./dpotrf_taskdep -n 16384 -b 256 -i 3
You can find details about the events and performance counters here.
How to attach PAPI hardware performance counters?
If you have configured the library with PAPI support, you can record hardware performance counters attached to each task at run time. For instance, you could specify (ellipses stand for the same command line as above):
> N=96; PREFIX=<your prefix>/lib; <...> \
KAAPI_RECORD_MASK=compute,omp,perfctr \
KAAPI_TASKPERF_EVENTS=work,PAPI_TOT_CYC,PAPI_TOT_INS \
<...> \
./dpotrf_taskdep -n 16384 -b 256 -i 3
Do not forget to add `perfctr` to the set of events in KAAPI_RECORD_MASK.
Processing generated data
All events and performance counters are recorded per thread into files located in /tmp. The typical output of the previous command is:
[OMP-TRACE] ompt-trace ompt_tool initialized
[OMP-TRACE] kaapi tracing version: Git last commit:624e6eb649329445+
...
##Progname Size Blocksize Iterations Threads Gflops(Mean) Stddev
dpotrf_taskdep 16384 256 3 192 1714.021480 47.314433
#Experience summarry : avg : 1714.021480 :: std : 47.314433 :: min : 1653.340947 :: max : 1768.782805 :: median : 1719.940687
[OMP-TRACE] kaapi tracing tool closed.
All events and performance counters are recorded per thread into the files:
/tmp/event.$USER.<pid>.<tid>.evt
where <pid> is the process id of the spawned process and <tid> is the thread identifier, ranging from 0 to $OMP_NUM_THREADS-1, where $OMP_NUM_THREADS is the maximum number of threads used at runtime.
These files contain all the useful information needed to generate:
- a DAG of the dependencies between tasks. One graph is generated per parallel region.
- a Gantt chart of the thread activities from the start of the program until its end.
- a CSV of tasks' executions and threads' activities
- a dump of performance counters.
All these features are available through katracereader, installed in the bin directory of the installation prefix.
Generating a CSV from internal trace files
CSV generation is based on katracereader with the --csv option. For instance:
> katracereader --csv /tmp/events.gautier.175535.*
...
*** File 'parallels.csv' generated
*** File 'threads.csv' generated
*** File 'tasks.csv' generated
Each CSV file contains information about a specific feature of the OpenMP program. Here, tasks.csv records the start and end of each task, its name, and the value of each attached counter. Each CSV file begins with a header that can be parsed by the R function 'read.csv'.
Plotting a Gantt chart with R
The format of the file 'tasks.csv' is illustrated by the following R function that reads it:
> library(dplyr);
> readtrace <- function (filename)
{
df <- read.csv(filename, header=TRUE, sep=",", strip.white=TRUE);
df <- df %>% filter((Explicit==1)) %>% as.data.frame();
df$Start <- df$Start*1e-9; # Convert ns to second
df$End <- df$End*1e-9;
df$Duration <- df$Duration*1e-9;
df;
}
> df <- readtrace("/Users/thierry/tasks.csv");
> head(df);
Resource Numa Start End Duration Explicit Aff Strict Tag Name
1 46 1 1499947765 1499947765 0.002393115 1 2 1 1 func: dplgsy file: unknown line: 0
2 15 0 1499947765 1499947765 0.002733835 1 2 1 0 func: dplgsy file: unknown line: 0
3 8 0 1499947765 1499947765 0.002776491 1 2 1 0 func: dplgsy file: unknown line: 0
4 43 1 1499947765 1499947765 0.002839405 1 2 1 1 func: dplgsy file: unknown line: 0
5 51 2 1499947765 1499947765 0.002407908 1 2 1 2 func: dplgsy file: unknown line: 0
6 58 2 1499947765 1499947765 0.002131242 1 2 1 2 func: dplgsy file: unknown line: 0
TaskId Work PAPI_TOT_CYC PAPI_TOT_INS Origin
1 75264 0.0023888 1681121 1390710 /Users/thierry/tmp/csv/tasks.csv
2 75008 0.0027351 1003768 890397 /Users/thierry/tmp/csv/tasks.csv
3 76032 0.0027742 1585495 1390256 /Users/thierry/tmp/csv/tasks.csv
4 76288 0.0028371 1420764 1388990 /Users/thierry/tmp/csv/tasks.csv
5 75520 0.0024026 1357263 1390318 /Users/thierry/tmp/csv/tasks.csv
6 76544 0.0021288 1554343 1390194 /Users/thierry/tmp/csv/tasks.csv
With the following meaning:
- Resource: the id of the thread executing the task
- Numa: the numa node attached to the thread
- Start,End,Duration: of the task
- Explicit==0 if implicit OpenMP task else 1
- Aff, Strict, Tag: affinity given to the task
- Name: the task's name
- TaskId: a system wide task identifier
- Work, PAPI_TOT_CYC, PAPI_TOT_INS: the performance counters specified in the variable KAAPI_TASKPERF_EVENTS
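As a quick illustration of how these counters can be used (a minimal sketch, assuming the df loaded with readtrace above and that PAPI_TOT_CYC and PAPI_TOT_INS were recorded), you can aggregate them per task name with dplyr, for instance to estimate instructions per cycle (IPC):
library(dplyr);
# Per task name: number of tasks, mean duration and mean IPC
# (IPC = PAPI_TOT_INS / PAPI_TOT_CYC).
df %>%
  group_by(Name) %>%
  summarise(ntasks   = n(),
            avg_time = mean(Duration),
            avg_ipc  = mean(PAPI_TOT_INS / PAPI_TOT_CYC)) %>%
  arrange(desc(avg_time));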
Once the CSV file is loaded, try:
library(ggplot2);
# helper: convert s to the date
date<-function(d) { as.POSIXct(d, origin="1970-01-01"); }
# theplot
ggplot() +
theme_bw(base_size=16) +
xlab("Time [s]") +
ylab("Thread Identification") +
scale_fill_brewer(palette = "Set1") +
theme (
plot.margin = unit(c(0,0,0,0), "cm"),
legend.spacing = unit(.1, "line"),
panel.grid.major = element_blank(),
panel.spacing=unit(0, "cm"),
panel.grid=element_blank(),
legend.position = "bottom",
legend.title = element_text("Helvetica")
) +
guides(fill = guide_legend(nrow = 1)) +
geom_rect(data=df, alpha=1, aes(fill=Name,
xmin=date(Start),
xmax=date(End),
ymin=Resource,
ymax=Resource+0.9)) +
scale_y_reverse();
You should obtain a Gantt chart of the task executions (one row per thread), where:
* by zooming, you can see tasks other than dgemm (red)!
* there are 4 computations while only 3 iterations are specified, because the benchmark performs one extra warm-up computation
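If you want to keep the figure as an image file, ggsave from ggplot2 writes the last displayed plot to disk; the file name below is an arbitrary example:
# Save the last displayed plot (here, the Gantt chart) to a PNG file.
ggsave("gantt.png", width=12, height=6, dpi=150);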
Generating a Paje file format
You can also generate a Paje trace file using katracereader and use the Vite application to display the Gantt chart.
For instance:
> katracereader --vite /tmp/events.gautier.175535.*
The output file vite-gantt.trace may then be visualized through Vite.
One step further: plotting the distribution of task execution times
df %>%
  ggplot() +
  geom_histogram(aes(x=Duration, fill=Name), bins=100) +
  facet_wrap(~Name, nrow=1, scales="free") +
  theme_bw(base_size=12);
You should obtain the following plot:
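Beyond the histogram, the same data frame lends itself to simple numeric summaries. The following sketch (again assuming the df loaded with readtrace above) sums the task execution time per thread, a quick way to check load balance:
library(dplyr);
# Total number of tasks and busy time per thread (Resource id).
df %>%
  group_by(Resource) %>%
  summarise(ntasks = n(), busy_time = sum(Duration)) %>%
  arrange(desc(busy_time)) %>%
  head();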
Acknowledgment
We thank Lucas Schnorr and Arnaud Legrand for providing us with their R code.