Commit 556840cf authored by PRUVOST Florent

update doc concerning trace option name

trailing whitespaces
@node Compilation configuration
@section Compilation configuration
The following arguments can be given to the @command{cmake <path to source
directory>} script.
In this chapter, the following convention is used:
@item
@option{var} is a string and the correct value or an example will be given,
@item
@option{trigger} is a CMake option and the correct value is @code{ON} or
@code{OFF}.
@end itemize
Using CMake there are several ways to give options:
@enumerate
@item directly as CMake command line arguments
@item invoke @command{cmake <path to source directory>} once and then use
@command{ccmake <path to source directory>} to edit options through a
minimalist GUI (requires
@samp{cmake-curses-gui} installed on a Linux system)
@item invoke the @command{cmake-gui} command and fill in information about the
location of the sources and where to build the project; then you have
access to options through a user-friendly Qt interface (requires
@samp{cmake-qt-gui} installed on a Linux system)
@end enumerate
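As a sketch, the three methods above might look like this on a Linux system (the source path ~/chameleon is only a placeholder):

```shell
# 1) Pass options directly on the cmake command line
cmake ~/chameleon -DCMAKE_BUILD_TYPE=Debug

# 2) Configure once, then edit options interactively in a curses GUI
cmake ~/chameleon
ccmake ~/chameleon

# 3) Start the Qt GUI, then fill in the source and build directories
cmake-gui
```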
Example of configuration using the command line:
@example
cmake ~/chameleon/ -DCMAKE_BUILD_TYPE=Debug \
-DCMAKE_INSTALL_PREFIX=~/install \
-DCHAMELEON_USE_MAGMA=ON \
-DCHAMELEON_USE_MPI=ON \
-DBLA_VENDOR=Intel10_64lp \
-DSTARPU_DIR=~/install/starpu-1.1 \
-DCHAMELEON_ENABLE_TRACING=ON
@end example
You can get the full list of options with the @option{-L[A][H]} option of the
@command{cmake} command:
@example
cmake -LH <path to source directory>
@table @code
@item -DCMAKE_INSTALL_PREFIX=@option{path} (default:@option{path=/usr/local})
Install directory used by @code{make install} where some headers and libraries
will be copied.
Permissions have to be granted to write to @option{path} during the @code{make
install} step.
@item -DCMAKE_BUILD_TYPE=@option{var} (default: @option{Release})
Define the build type and the compiler optimization level.
The possible values for @option{var} are:
@table @code
@item empty
@item Debug
@end table
@item -DBUILD_SHARED_LIBS=@option{trigger} (default:@option{OFF})
Indicate whether CMake has to build CHAMELEON static (@option{OFF}) or
shared (@option{ON}) libraries.
@end table
@node CHAMELEON options
@subsection CHAMELEON options
List of CHAMELEON options that can be enabled/disabled (value=@code{ON}
or @code{OFF}):
@table @code
to link with StarPU library (runtime system)
to link with QUARK library (runtime system)
@item @option{-DCHAMELEON_USE_CUDA}=@option{trigger} (default: @code{OFF})
to link with CUDA runtime (implementation paradigm for accelerated codes on
GPUs) and cuBLAS library (optimized BLAS kernels on GPUs), can only be used with
StarPU
@item @option{-DCHAMELEON_USE_MAGMA}=@option{trigger} (default: @code{OFF})
to link with MAGMA library (kernels on GPUs, higher level than cuBLAS), can only
be used with StarPU
@item @option{-DCHAMELEON_USE_MPI}=@option{trigger} (default: @code{OFF})
to link with MPI library (message passing implementation for use of multiple
nodes with distributed memory), can only be used with StarPU
@item @option{-DCHAMELEON_ENABLE_TRACING}=@option{trigger} (default: @code{OFF})
to enable trace generation during execution of timing drivers.
It requires StarPU to be linked with FxT library (trace execution of kernels on workers).
@item @option{-DCHAMELEON_SIMULATION=trigger} (default: @code{OFF})
to enable simulation mode, meaning CHAMELEON will not actually execute tasks;
see details in section @ref{Use simulation mode with StarPU-SimGrid}.
This option must be used with StarPU compiled with
@uref{http://simgrid.gforge.inria.fr/, SimGrid}, which allows estimating the
execution time on any architecture.
This feature should be used to experiment with scheduler behavior and
performance, not to produce solutions of linear systems.
@item @option{-DCHAMELEON_ENABLE_DOCS=trigger} (default: @code{ON})
to control build of the documentation contained in @file{docs/} sub-directory
@item @option{-DCHAMELEON_ENABLE_EXAMPLE=trigger} (default: @code{ON})
to control build of the example executables (API usage)
contained in @file{example/} sub-directory
@item @option{-DCHAMELEON_ENABLE_TESTING=trigger} (default: @code{ON})
to control build of testing executables (numerical check) contained in
@file{testing/} sub-directory
@item @option{-DCHAMELEON_ENABLE_TIMING=trigger} (default: @code{ON})
to control build of timing executables (performance check) contained in
@file{timing/} sub-directory
@item @option{-DCHAMELEON_PREC_S=trigger} (default: @code{ON})
to enable the support of double arithmetic precision (double in C)
@item @option{-DCHAMELEON_PREC_C=trigger} (default: @code{ON})
to enable the support of complex arithmetic precision (complex in C)
@item @option{-DCHAMELEON_PREC_Z=trigger} (default: @code{ON})
to enable the support of double complex arithmetic precision (double complex
in C)
@item @option{-DBLAS_VERBOSE=trigger} (default: @code{OFF})
to make BLAS library discovery verbose
@item @option{-DLAPACK_VERBOSE=trigger} (default: @code{OFF})
to make LAPACK library discovery verbose (automatically enabled if
@option{BLAS_VERBOSE=@code{ON}})
@end table
The possible values for @option{var} are:
@table @code
@item Generic
@item ...
@end table
to force CMake to find a specific BLAS library; see the full list of BLA_VENDOR
in @file{FindBLAS.cmake} in @file{cmake_modules/morse/find}.
By default @option{BLA_VENDOR} is empty so that CMake tries to detect all
possible BLAS vendors, with a preference for Intel MKL.
@end table
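For instance, assuming the placeholder source path ~/chameleon, @option{BLA_VENDOR} can be forced on the command line as follows:

```shell
# Force Intel MKL (LP64 interface); vendor names are listed in FindBLAS.cmake
cmake ~/chameleon -DBLA_VENDOR=Intel10_64lp

# Or fall back to the reference Netlib BLAS
cmake ~/chameleon -DBLA_VENDOR=Generic
```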
@table @code
@item @option{-DLIBNAME_INCDIR=@option{path}} (default: empty)
directory of the LIBNAME library headers installation
@item @option{-DLIBNAME_LIBDIR=@option{path}} (default: empty)
directory of the LIBNAME libraries (.so, .a, .dylib, etc) installation
@end table
LIBNAME can be one of the following: BLAS - CBLAS - FXT - HWLOC -
LAPACK - LAPACKE - MAGMA - QUARK - STARPU - TMG.
See paragraph about @ref{Dependencies detection} for details.
Libraries detected with an official CMake module (see module files in
@file{CMAKE_ROOT/Modules/}):
@itemize @bullet
@item CUDA
@item Threads
@end itemize
Libraries detected with CHAMELEON cmake modules (see module files in
@file{cmake_modules/morse/find/} directory of CHAMELEON sources):
@itemize @bullet
@item BLAS
@item MAGMA
@item QUARK
@item STARPU
@item TMG
@end itemize
@node Dependencies detection
@section Dependencies detection
You have different choices to detect dependencies on your system: either set
some environment variables containing the paths to the libraries and headers, or
specify them directly at CMake configure time.
Different cases:
@enumerate
@item detection of dependencies through environment variables:
@itemize @bullet
@item @env{LD_LIBRARY_PATH} environment variable should contain the list of
paths where the libraries can be found:
@example
export @env{LD_LIBRARY_PATH}=$@env{LD_LIBRARY_PATH}:path/to/your/libs
@end example
@item @env{INCLUDE} environment variable should contain the list of paths
where to find the header files of libraries
@example
export @env{INCLUDE}=$@env{INCLUDE}:path/to/your/headers
@end example
@end itemize
@item detection with user's given paths:
@itemize @bullet
@item you can specify the path at cmake configure by invoking
@example
cmake <path to SOURCE_DIR> -DLIBNAME_DIR=path/to/your/lib
@end example
where LIBNAME stands for the name of the library to look for, for example
@example
cmake <path to SOURCE_DIR> -DSTARPU_DIR=path/to/starpudir \
-DCBLAS_DIR= ...
@end example
@item it is also possible to specify headers and library directories
separately, example
@example
cmake <path to SOURCE_DIR> \
-DSTARPU_INCDIR=path/to/libstarpu/include/starpu/1.1 \
-DSTARPU_LIBDIR=path/to/libstarpu/lib
@end example
@item Note that BLAS and LAPACK detection can be tedious, so we provide a
verbose mode. Use @option{-DBLAS_VERBOSE=ON} or @option{-DLAPACK_VERBOSE=ON} to
enable it.
@end itemize
@end enumerate
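The two detection mechanisms above can be combined; a minimal sketch, assuming StarPU is installed under $HOME/install/starpu:

```shell
# Environment-based detection: extend the library and header search paths
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/install/starpu/lib
export INCLUDE=$INCLUDE:$HOME/install/starpu/include

# Explicit detection: give the install prefix at configure time,
# with verbose BLAS discovery to debug detection problems
cmake <path to SOURCE_DIR> -DSTARPU_DIR=$HOME/install/starpu -DBLAS_VERBOSE=ON
```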
@node Use FxT profiling through StarPU
@section Use FxT profiling through StarPU
StarPU can generate its own trace log files by compiling it with the
@option{--with-fxt}
option at the configure step (you may have to specify the directory where you
installed FxT by giving @option{--with-fxt=...} instead of @option{--with-fxt}
alone).
By doing so, traces are generated after each execution of a program which uses
StarPU in the directory pointed to by the @env{STARPU_FXT_PREFIX} environment
variable. Example:
@example
export @env{STARPU_FXT_PREFIX}=/home/yourname/fxt_files/
@end example
When executing a @command{./timing/...} CHAMELEON program, if it has been
enabled (StarPU compiled with FxT and @option{-DCHAMELEON_ENABLE_TRACING=ON}), you
can give the option @option{--trace} to tell the program to generate trace log
files.
Finally, to generate the trace file which can be opened with the
@uref{http://vite.gforge.inria.fr/, ViTE} program, you have to use the
@command{starpu_fxt_tool} executable of StarPU.
This tool should be in @file{path/to/your/install/starpu/bin}.
You can use it to generate the trace file like this:
@itemize @bullet
@item @command{path/to/your/install/starpu/bin/starpu_fxt_tool -i prof_filename}
There is one file per MPI process (prof_filename_0, prof_filename_1, ...).
To generate a trace of MPI programs you can call it like this:
@item @command{path/to/your/install/starpu/bin/starpu_fxt_tool -i
prof_filename*}
The trace file will be named paje.trace (use the -o option to specify an output
name).
@end itemize
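Putting the steps above together, a complete tracing session might look as follows (the driver name time_dpotrf_tile and the install paths are assumptions; this requires StarPU built with @option{--with-fxt} and CHAMELEON configured with @option{-DCHAMELEON_ENABLE_TRACING=ON}):

```shell
# Traces will be written in this directory after each run
export STARPU_FXT_PREFIX=$HOME/fxt_files/

# Run a timing driver with trace generation enabled
./timing/time_dpotrf_tile --trace

# Convert the per-process prof files into a single paje.trace for ViTE
$HOME/install/starpu/bin/starpu_fxt_tool -i $STARPU_FXT_PREFIX/prof_file_*
```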
@node Use simulation mode with StarPU-SimGrid
@section Use simulation mode with StarPU-SimGrid
Simulation mode can be enabled by setting the cmake option
@option{-DCHAMELEON_SIMULATION=ON}.
This mode allows you to simulate execution of algorithms with StarPU compiled
with @uref{http://simgrid.gforge.inria.fr/, SimGrid}.
To do so, we provide some perfmodels in the @file{simucore/perfmodels/}
directory of CHAMELEON sources.
To use these perfmodels, please set the following
@itemize @bullet
@example
@code{<path to SOURCE_DIR>/simucore/perfmodels}
@end example
@item @env{STARPU_HOSTNAME} environment variable to the name of the machine to
simulate. For example, on our platform (PlaFRIM) with GPUs at Inria Bordeaux
@example
@env{STARPU_HOSTNAME}=mirage
@end example
Note that only POTRF kernels with block sizes of 320 or 960 (simple and double
precision) on mirage machine are available for now.
The database of models is subject to change; it should be enriched in the near future.
@end itemize
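A sketch of a simulated run, assuming CHAMELEON was configured with @option{-DCHAMELEON_SIMULATION=ON} on top of a SimGrid-enabled StarPU (the @env{STARPU_PERF_MODEL_DIR} variable and the driver name are assumptions):

```shell
# Point StarPU to the performance models shipped with CHAMELEON sources
export STARPU_PERF_MODEL_DIR=<path to SOURCE_DIR>/simucore/perfmodels
# Simulate the 'mirage' machine (GPU node of the PlaFRIM platform)
export STARPU_HOSTNAME=mirage

# The driver runs without performing any real computation
./timing/time_dpotrf_tile
```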
@menu
* Build process of CHAMELEON::
@end menu
CHAMELEON can be built and installed by the standard means of CMake
(@uref{http://www.cmake.org/}).
General information about CMake, as well as installation binaries and CMake
source code, are available from
@uref{http://www.cmake.org/cmake/resources/software.html}.
The following chapter briefly recalls how these tools can be used
to install CHAMELEON.
@node Downloading CHAMELEON
@node Getting Sources
@subsection Getting Sources
The latest official release tarballs of CHAMELEON sources are available for
download from
@uref{http://morse.gforge.inria.fr/chameleon-0.9.1.tar.gz, chameleon-0.9.1}.
@c The latest development snapshot is available from
@c @uref{http://hydra.bordeaux.inria.fr/job/hiepacs/morse-cmake/tarball/latest/
@c download-by-type/file/source-dist}.
@node a BLAS implementation
@subsubsection a BLAS implementation
@uref{http://www.netlib.org/blas/, BLAS} (Basic Linear Algebra Subprograms)
are a de facto standard for basic linear algebra operations such as vector and
matrix multiplication.
A FORTRAN implementation of BLAS is available from Netlib.
Also, a C implementation of BLAS is included in GSL (GNU Scientific Library).
Both of these are reference implementations of BLAS; they are not
optimized for modern processor architectures and provide an order of magnitude
lower performance than optimized implementations.
Highly optimized implementations of BLAS are available from many hardware
vendors, such as Intel MKL and AMD ACML.
Fast implementations are also available as academic packages, such as ATLAS and
Goto BLAS.
The standard interface to BLAS is the FORTRAN interface.
@strong{Caution about the compatibility:} CHAMELEON has been mainly tested with
the reference BLAS from NETLIB and the Intel MKL 11.1 from Intel distribution
2013_sp1.
@node CBLAS
@subsubsection CBLAS
@uref{http://www.netlib.org/blas/#_cblas, CBLAS} is a C language interface to
BLAS.
Most commercial and academic implementations of BLAS also provide CBLAS.
Netlib provides a reference implementation of CBLAS on top of FORTRAN BLAS
(Netlib CBLAS).
Since GSL is implemented in C, it naturally provides CBLAS.
@strong{Caution about the compatibility:} CHAMELEON has been mainly tested with
the reference CBLAS from NETLIB and the Intel MKL 11.1 from Intel distribution
2013_sp1.
@node a LAPACK implementation
@subsubsection a LAPACK implementation
@uref{http://www.netlib.org/lapack/, LAPACK} (Linear Algebra PACKage) is a
software library for numerical linear algebra, a successor of LINPACK and
EISPACK and a predecessor of CHAMELEON.
LAPACK provides routines for solving linear systems of equations, linear least
squares problems, eigenvalue problems and singular value problems.
Most commercial and academic BLAS packages also provide some LAPACK routines.
@strong{Caution about the compatibility:} CHAMELEON has been mainly tested with
the reference LAPACK from NETLIB and the Intel MKL 11.1 from Intel distribution
2013_sp1.
@node LAPACKE
@subsubsection LAPACKE
@uref{http://www.netlib.org/lapack/, LAPACKE} is a C language interface to
LAPACK (or CLAPACK).
It is produced by Intel in coordination with the LAPACK team and is available
in source code from Netlib in its original version (Netlib LAPACKE) and from
the CHAMELEON website in an extended version (LAPACKE for CHAMELEON).
In addition to implementing the C interface, LAPACKE also provides routines
which automatically handle workspace allocation, making the use of LAPACK much
more convenient.
@strong{Caution about the compatibility:} CHAMELEON has been mainly tested with
the reference LAPACKE from NETLIB.
A stand-alone version of LAPACKE is required.
@node libtmg
@subsubsection libtmg
@uref{http://www.netlib.org/lapack/, libtmg} is a component of the LAPACK
library, containing routines for the generation
of input matrices for testing and timing of LAPACK.
The testing and timing suites of LAPACK require libtmg, but not the library
itself. Note that the LAPACK library can be built and used without libtmg.
@strong{Caution about the compatibility:} CHAMELEON has been mainly tested with
the reference TMG from NETLIB and the Intel MKL 11.1 from Intel distribution
2013_sp1.
@node QUARK
@subsubsection QUARK
@uref{http://icl.cs.utk.edu/quark/, QUARK} (QUeuing And Runtime for Kernels)
provides a library that enables the dynamic execution of tasks with data
dependencies in a multi-core, multi-socket, shared-memory environment.
One of the QUARK or StarPU runtime systems has to be enabled in order to schedule
tasks on the architecture.
If QUARK is enabled then StarPU is disabled, and vice versa.
Note that StarPU is enabled by default.
When CHAMELEON is linked with QUARK, it is not possible to exploit either
CUDA (for GPUs) or MPI (distributed-memory environment).
You can use StarPU to do so.
@strong{Caution about the compatibility:} CHAMELEON has been mainly tested with
the QUARK library from PLASMA release between versions 2.5.0 and 2.6.0.
@node StarPU
@subsubsection StarPU
@uref{http://runtime.bordeaux.inria.fr/StarPU/, StarPU} is a task programming
library for hybrid architectures.
StarPU handles run-time concerns such as:
@itemize @bullet
@item Task dependencies
@item Optimized heterogeneous scheduling
@item Optimized data transfers and replication between main memory and discrete
memories
@item Optimized cluster communications
@end itemize
StarPU can be used to benefit from GPUs and distributed-memory environments.
One of the QUARK or StarPU runtime systems has to be enabled in order to schedule
tasks on the architecture.
If StarPU is enabled then QUARK is disabled, and vice versa.
Note that StarPU is enabled by default.
@strong{Caution about the compatibility:} CHAMELEON has been mainly tested with
StarPU-1.1 releases.
@node hwloc
@subsubsection hwloc
@uref{http://www.open-mpi.org/projects/hwloc/, hwloc} (Portable Hardware
Locality) is a software package for accessing the topology of a multicore
system, including components like cores, sockets, caches and NUMA nodes.
@c The topology discovery library, @code{hwloc}, is not mandatory to use StarPU
@c but strongly recommended.
It helps to increase performance and to perform some topology-aware
scheduling.
@code{hwloc} is available in major distributions and for most OSes and can be
downloaded from @uref{http://www.open-mpi.org/software/hwloc}.
@strong{Caution about the compatibility:} hwloc should be compatible with the
version of StarPU used.
@node pthread
@subsubsection pthread
The POSIX threads library is required to run CHAMELEON on Unix-like systems.
It is a standard component of any such system.
@comment Windows threads are used on Microsoft Windows systems.
@node Optional dependencies
@node OpenMPI
@subsubsection OpenMPI
@uref{http://www.open-mpi.org/, OpenMPI} is an open source Message Passing
Interface implementation for execution on multiple nodes with
distributed-memory environment.
MPI can be enabled only if the runtime system chosen is StarPU (default).
To use MPI through StarPU, it is necessary to compile StarPU with MPI
enabled.
@strong{Caution about the compatibility:} CHAMELEON has been mainly tested with
OpenMPI releases from versions 1.4 to 1.6.
@node Nvidia CUDA Toolkit
@subsubsection Nvidia CUDA Toolkit
@uref{https://developer.nvidia.com/cuda-toolkit, Nvidia CUDA Toolkit} provides a
comprehensive development environment for C and C++ developers building
GPU-accelerated applications.
CHAMELEON can use a set of low-level optimized kernels coming from cuBLAS to
accelerate computations on GPUs.
The @uref{http://docs.nvidia.com/cuda/cublas/, cuBLAS} library is an
implementation of BLAS (Basic Linear Algebra Subprograms) on top of the Nvidia
CUDA runtime.
cuBLAS is normally distributed with the Nvidia CUDA Toolkit.
CUDA/cuBLAS can be enabled in CHAMELEON only if the runtime system chosen
is StarPU (default).
To use CUDA through StarPU, it is necessary to compile StarPU with CUDA
enabled.
@strong{Caution about the compatibility:} CHAMELEON has been mainly tested with
CUDA releases from versions 4 to 6.
MAGMA library must be compatible with CUDA.
@node MAGMA
@subsubsection MAGMA
@uref{http://icl.cs.utk.edu/magma/, MAGMA} project aims to develop a dense
linear algebra library similar to LAPACK but for heterogeneous/hybrid