Mentions légales du service

Skip to content
Snippets Groups Projects

XKaapi 2

Welcome to the new experimental XKaapi implementation.
This repository is highly experimental and not yet fully compatible with older XKaapi/XKBlas releases located at http://gitlab.inria.fr/xkblas/versions. It is a C++ implementation and interfaces of the XKaapi dataflow and task-based programming model.

Please open any issue at https://github.com/rpereira-dev/xkaapi

Related Projects

This repository hosts the XKaapi runtime system.
Other repository hosts specialization layers built on top of the runtime:

Getting started

Installation

Requirements

Optional

Build command example

See the CMakeLists.txt file for all available options.

# with support for only the host driver and debug modes, typically for developing on local machines with no GPUs
CC=clang CXX=clang++ cmake -DUSE_STATS=on -DCMAKE_BUILD_TYPE=Debug ..

# with support for Cuda
CC=clang CXX=clang++ CMAKE_PREFIX_PATH=$CUDA_PATH:$CMAKE_PREFIX_PATH cmake -DUSE_CUDA=on ..

# with support for Cuda and all optimization
CC=clang CXX=clang++ CMAKE_PREFIX_PATH=$CUDA_PATH:$CMAKE_PREFIX_PATH cmake -DUSE_CUDA=on -DUSE_SHUT_UP=on -DENABLE_HEAVY_DEBUG=off -DCMAKE_BUILD_TYPE=Release ..

Available environment variable

  • XKAAPI_HELP=1 - displays available environment variables.

Directions for improvements

  • Rework interface for distributions - so it is an abstract object that may be passed to various constructs - look at what PGAS do
  • Add explicit/synchronous driver interfaces, to allow users to bypas event/polling/coherency/tasking. Particularly for OpenMP blocking constructs
  • Add a memory coherency controller for 'point' accesses, to retrieve original xkblas/kaapi behavior - that is a decentralized and paged-base coherency protocol
  • Tasks are currently only deleted all-at-once on invalidate calls.
  • Stuff from xkrt-init could be moved for lazier initializations
  • Add support for blas compact symetric matrices
  • Add support for commutative write, maybe with a priority-heap favoring accesses with different heuristics (the most successors, the most volume of data as successors, etc...)
  • Add support for IA/ML devices (most of them only have high-level Python API, only Graphcore seems to have a good C API, but Graphcore seems to be dying)