Dense linear algebra subroutines for heterogeneous and distributed architectures

Chameleon: A dense linear algebra software for heterogeneous architectures

Chameleon is a C library that provides parallel algorithms for BLAS/LAPACK operations and fully exploits modern architectures.

Chameleon relies on sequential task-based algorithms, in which the sub-tasks of the overall algorithm are submitted to a runtime system. Such a system is a layer between the application and the hardware that handles the scheduling and the actual execution of tasks on the processing units. A runtime system such as StarPU can automatically manage data transfers between memory areas that are not shared (CPUs-GPUs, distributed nodes).
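
The submission model described above can be illustrated with a minimal sketch in C. It is not Chameleon's API: the names toy_runtime, submit, submit_cholesky, run_all and toy_sqrt are hypothetical, the "tiles" are 1x1 scalars, and the toy runtime simply replays tasks in submission order. A real runtime system such as StarPU would instead be free to reorder and run independent tasks in parallel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task kinds of a tile Cholesky factorization (lower case). */
typedef enum { TASK_POTRF, TASK_TRSM, TASK_SYRK, TASK_GEMM } task_kind;

/* A task names a kernel and the tile indices it works on. */
typedef struct { task_kind kind; int k, m, n; } task;

#define MAX_TASKS 1024

/* Toy stand-in for a runtime system: a FIFO queue of submitted tasks. */
typedef struct { task items[MAX_TASKS]; size_t count; } toy_runtime;

void submit(toy_runtime *rt, task_kind kind, int k, int m, int n)
{
    assert(rt->count < MAX_TASKS);
    rt->items[rt->count++] = (task){ kind, k, m, n };
}

/* Sequential loop nest: it only submits tasks and does no arithmetic. */
void submit_cholesky(toy_runtime *rt, int nt)
{
    for (int k = 0; k < nt; k++) {
        submit(rt, TASK_POTRF, k, k, k);          /* factor diagonal tile */
        for (int m = k + 1; m < nt; m++)
            submit(rt, TASK_TRSM, k, m, k);       /* solve panel tiles    */
        for (int m = k + 1; m < nt; m++) {
            submit(rt, TASK_SYRK, k, m, m);       /* update diagonal tile */
            for (int n = k + 1; n < m; n++)
                submit(rt, TASK_GEMM, k, m, n);   /* update off-diagonal  */
        }
    }
}

/* Newton iteration for sqrt of a positive x, used to avoid needing libm. */
double toy_sqrt(double x)
{
    double r = x > 1.0 ? x : 1.0;
    for (int i = 0; i < 64; i++)
        r = 0.5 * (r + x / r);
    return r;
}

/* Execute the queued tasks in submission order on an nt x nt row-major
 * matrix whose tiles are 1x1 scalars; only the lower triangle is used. */
void run_all(const toy_runtime *rt, double *a, int nt)
{
    for (size_t i = 0; i < rt->count; i++) {
        const task *t = &rt->items[i];
        int k = t->k, m = t->m, n = t->n;
        switch (t->kind) {
        case TASK_POTRF: a[k * nt + k] = toy_sqrt(a[k * nt + k]);            break;
        case TASK_TRSM:  a[m * nt + k] /= a[k * nt + k];                     break;
        case TASK_SYRK:  a[m * nt + m] -= a[m * nt + k] * a[m * nt + k];     break;
        case TASK_GEMM:  a[m * nt + n] -= a[m * nt + k] * a[n * nt + k];     break;
        }
    }
}
```

For the 3x3 matrix with rows (4, 12, -16), (12, 37, -43), (-16, -43, 98), submit_cholesky(&rt, 3) queues 10 tasks, and run_all leaves the lower triangle holding the factor with rows (2), (6, 1), (-8, 5, 3), up to rounding. The point of the design is that the algorithm is written once as a plain sequential loop nest, while the decision of where and when each task runs is left to the runtime.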

This implementation paradigm makes it possible to design high-performance linear algebra algorithms for very different types of architecture: laptops, many-core nodes, CPU-GPU nodes, multiple nodes. For example, Chameleon can perform a double-precision Cholesky factorization at 80 TFlop/s on a dense matrix of order 400 000 (i.e. in about 4 min). Chameleon is a sub-project of MORSE specifically dedicated to dense linear algebra.
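
As a rough consistency check of the Cholesky figures quoted above, one can assume the usual operation count of about n^3/3 floating-point operations for a Cholesky factorization of order n. This count is an assumption here; the text does not say which one was used to report the rate.

```latex
\[
  t \approx \frac{n^{3}/3}{P}
    = \frac{(4 \times 10^{5})^{3} / 3}{80 \times 10^{12}\ \text{flop/s}}
    \approx \frac{2.13 \times 10^{16}}{8 \times 10^{13}}\ \text{s}
    \approx 267\ \text{s}
    \approx 4.4\ \text{min}
\]
```

Under that assumption, a matrix of order 400 000 processed at 80 TFlop/s takes a little over four minutes, which matches the stated order of magnitude.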

1 Get Chameleon

To use the latest development version of Chameleon, clone the master branch. Note that Chameleon contains a git submodule, morse_cmake. To get the sources, use these commands:

# if git version >= 1.9
git clone --recursive
cd chameleon
# else
git clone
cd chameleon
git submodule init
git submodule update

For now, the latest releases of Chameleon are hosted outside this gitlab project. Future releases will be available on this gitlab project.

2 Documentation

There is currently no up-to-date documentation for Chameleon. We would like to provide doxygen documentation hosted on gitlab in the future. Please refer to section 2.1 of READMEDEV for information about generating the documentation.

The documentation of Chameleon’s latest release is available here: chameleon-0.9.1 documentation.

2.1 For developers


3 Installation

3.1 Distribution of Chameleon

To install a full distribution (Chameleon plus its dependencies), we encourage users to use the morse branch of Spack.

Please read these documentations:

3.1.1 Usage example for a simple distribution of Chameleon

git clone
. ./spack/share/spack/
spack install -v chameleon
# chameleon is installed here:
spack location -i chameleon

3.2 Build and install with CMake

Chameleon can be built using CMake. This installation method requires some library dependencies to be already installed on the system.

Please refer to the chameleon-0.9.1 documentation for configuration information.

4 Get involved!

4.1 Mailing list

To contact the developers send an email to

4.2 Contributions


5 Authors

First, since the Chameleon library started as an extension of the PLASMA library to support multiple runtime systems, all developers of the PLASMA library are also developers of the Chameleon library.

The following people contributed to the development of Chameleon:

  • Emmanuel Agullo, PI
  • Olivier Aumage
  • Cedric Castagnede
  • Terry Cojean
  • Mathieu Faverge, PI
  • Nathalie Furmento
  • Reazul Hoque
  • Hatem Ltaief
  • Gregoire Pichon
  • Florent Pruvost, PI
  • Marc Sergent
  • Guillaume Sylvand
  • Samuel Thibault
  • Stanimire Tomov
  • Omar Zenati

If we forgot your name, please let us know so that we can fix the mistake.

6 Citing Chameleon

Feel free to use the following publications to reference Chameleon:

  • Original paper that initiated Chameleon and the principles:
    • Agullo, Emmanuel and Augonnet, Cédric and Dongarra, Jack and Ltaief, Hatem and Namyst, Raymond and Thibault, Samuel and Tomov, Stanimire, Faster, Cheaper, Better – a Hybridization Methodology to Develop Linear Algebra Software for GPUs, GPU Computing Gems, First Online: 17 December 2010.
  • Design of the QR algorithms:
    • Agullo, Emmanuel and Augonnet, Cédric and Dongarra, Jack and Faverge, Mathieu and Ltaief, Hatem and Thibault, Samuel and Tomov, Stanimire, QR Factorization on a Multicore Node Enhanced with Multiple GPU Accelerators, 25th IEEE International Parallel & Distributed Processing Symposium, First Online: 16 December 2010.
  • Design of the LU algorithms:
    • Agullo, Emmanuel and Augonnet, Cédric and Dongarra, Jack and Faverge, Mathieu and Langou, Julien and Ltaief, Hatem and Tomov, Stanimire, LU Factorization for Accelerator-based Systems, 9th ACS/IEEE International Conference on Computer Systems and Applications (AICCSA 11), First Online: 21 December 2011.
  • Regarding distributed memory:
    • Agullo, Emmanuel and Aumage, Olivier and Faverge, Mathieu and Furmento, Nathalie and Pruvost, Florent and Sergent, Marc and Thibault, Samuel, Achieving High Performance on Supercomputers with a Sequential Task-based Programming Model, Research Report, First Online: 16 June 2016.

7 Licence