- Mar 17, 2017
THIBAULT Samuel authored
-
- Mar 14, 2017
Mathieu Faverge authored
-
- Dec 24, 2016
Mathieu Faverge authored
-
- Dec 09, 2016
Mathieu Faverge authored
-
- Oct 12, 2016
Guillaume Sylvand authored
-
Guillaume Sylvand authored
timing: add option --bigmat to choose whether we allocate one big 'mat' array or the runtime allocates the tiles one by one
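The two allocation strategies that --bigmat selects between can be sketched as follows. This is only an illustration: `tile_storage`, `tile_storage_init` and `tile_storage_free` are hypothetical names that do not exist in the library, and plain `malloc` stands in for whatever the runtime does when it allocates tiles itself.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container, not part of MORSE: tile[k] is the k-th tile. */
typedef struct {
    size_t   ntiles;
    size_t   tile_elems;
    double  *big;   /* non-NULL only when one big array backs all tiles */
    double **tile;
} tile_storage;

void tile_storage_free(tile_storage *s)
{
    if (s->tile == NULL)
        return;
    if (s->big != NULL) {
        free(s->big);                 /* one free releases every tile */
    } else {
        for (size_t k = 0; k < s->ntiles; k++)
            free(s->tile[k]);         /* free(NULL) is a no-op */
    }
    free(s->tile);
    s->tile = NULL;
    s->big  = NULL;
}

/* bigmat != 0: one contiguous array, tiles are offsets into it.
 * bigmat == 0: every tile is allocated separately (stand-in for the
 * runtime allocating tiles one by one). Returns 0 on success. */
int tile_storage_init(tile_storage *s, size_t ntiles, size_t tile_elems,
                      int bigmat)
{
    assert(ntiles > 0 && tile_elems > 0);
    s->ntiles     = ntiles;
    s->tile_elems = tile_elems;
    s->big        = NULL;
    s->tile       = malloc(ntiles * sizeof *s->tile);
    if (s->tile == NULL)
        return -1;
    for (size_t k = 0; k < ntiles; k++)
        s->tile[k] = NULL;

    if (bigmat) {
        s->big = malloc(ntiles * tile_elems * sizeof *s->big);
        if (s->big == NULL) {
            free(s->tile);
            s->tile = NULL;
            return -1;
        }
        for (size_t k = 0; k < ntiles; k++)
            s->tile[k] = s->big + k * tile_elems;
    } else {
        for (size_t k = 0; k < ntiles; k++) {
            s->tile[k] = malloc(tile_elems * sizeof **s->tile);
            if (s->tile[k] == NULL) {
                tile_storage_free(s);
                return -1;
            }
        }
    }
    return 0;
}
```

With one big array the tiles are guaranteed contiguous and a single allocation can fail early; with per-tile allocation the memory can be obtained piecewise, at the cost of one allocation per tile.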
-
- Sep 20, 2016
Guillaume Sylvand authored
This routine (gemm3m), available in MKL, computes a complex matrix product in 6n^3 ops instead of 8n^3, but it pays off only for "large enough" matrices (the threshold remains to be tested). Potentially, we gain 25% in all complex computations. It could be interesting to look for it or implement it in CUDA.
Note that the flop counters are not updated: in C/Z precision, most flop counters should be multiplied by 0.75.
It is OFF by default. It is activated with MORSE_Enable(MORSE_GEMM3M). In the timing routines, it is activated with --gemm3m.
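The 6n^3 versus 8n^3 count comes from replacing four real matrix products by three: a real n x n product costs 2n^3 flops, so the conventional complex product costs 4 x 2n^3 = 8n^3 and the three-product form costs 3 x 2n^3 = 6n^3, plus O(n^2) additions. The following is a minimal sketch of that idea, not MKL's implementation: `gemm3m_sketch` and `real_gemm` are hypothetical helpers, the matrices are stored as separate real and imaginary parts for simplicity, and a naive triple loop stands in for an optimized real GEMM.

```c
#include <assert.h>
#include <stddef.h>

/* Naive real n x n product c = a * b (row-major), 2n^3 flops. */
static void real_gemm(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double s = 0.0;
            for (size_t k = 0; k < n; k++)
                s += a[i * n + k] * b[k * n + j];
            c[i * n + j] = s;
        }
    }
}

/* Hypothetical helper: (ar + i*ai) * (br + i*bi) with three real products.
 *   t1 = ar*br, t2 = ai*bi, t3 = (ar + ai)*(br + bi)
 *   real part = t1 - t2, imaginary part = t3 - t1 - t2
 * The workspace w must hold at least 5*n*n doubles. */
void gemm3m_sketch(size_t n,
                   const double *ar, const double *ai,
                   const double *br, const double *bi,
                   double *cr, double *ci, double *w)
{
    assert(n > 0);
    size_t nn = n * n;
    double *t1 = w;
    double *t2 = w + nn;
    double *t3 = w + 2 * nn;
    double *sa = w + 3 * nn;
    double *sb = w + 4 * nn;

    for (size_t i = 0; i < nn; i++) {
        sa[i] = ar[i] + ai[i];
        sb[i] = br[i] + bi[i];
    }
    real_gemm(n, ar, br, t1);
    real_gemm(n, ai, bi, t2);
    real_gemm(n, sa, sb, t3);
    for (size_t i = 0; i < nn; i++) {
        cr[i] = t1[i] - t2[i];
        ci[i] = t3[i] - t1[i] - t2[i];
    }
}
```

The imaginary part is obtained by subtraction, so its rounding behavior differs from the four-product form. The extra O(n^2) additions and workspace are also consistent with the gain showing up only for large enough matrices.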
-
Guillaume Sylvand authored
Progress reporting is OFF by default. It is activated with MORSE_Enable(MORSE_PROGRESS). In the timing routines, it is activated with --progress. No progress is printed for tasks faster than 10 seconds.
-
- Sep 16, 2015
THIBAULT Samuel authored
Add MORSE_Distributed_size and MORSE_Distributed_rank so that applications do not hardcode the use of MPI. Introduce RUNTIME_distributed_rank, RUNTIME_distributed_size and RUNTIME_distributed_barrier so that MORSE does not hardcode the use of MPI either. This makes it possible to use simgrid-mpi.
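The layering described above (the application asks the library, the library asks the runtime, and only the runtime knows the transport) can be sketched as follows. Everything here is hypothetical: `dist_backend`, `dist_rank`, `dist_size`, `dist_barrier` and `rows_owned` are illustrative names, and the function-pointer table is just one way to express the indirection, not necessarily how MORSE wires its RUNTIME_* functions. A single-process backend is used so the example has no MPI dependency.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical backend interface: the only place that would know about
 * MPI, simgrid-mpi, or any other transport. */
typedef struct {
    int  (*rank)(void);
    int  (*size)(void);
    void (*barrier)(void);
} dist_backend;

/* Fallback backend for a non-distributed run. */
static int  single_rank(void)    { return 0; }
static int  single_size(void)    { return 1; }
static void single_barrier(void) { }

static const dist_backend single_process_backend = {
    single_rank, single_size, single_barrier
};

static const dist_backend *current_backend = &single_process_backend;

/* A real backend (for example one wrapping MPI calls) would be
 * installed here at initialization time. */
void dist_set_backend(const dist_backend *b)
{
    assert(b != NULL && b->rank != NULL && b->size != NULL
           && b->barrier != NULL);
    current_backend = b;
}

/* Library-level wrappers: callers never mention the transport. */
int  dist_rank(void)    { return current_backend->rank(); }
int  dist_size(void)    { return current_backend->size(); }
void dist_barrier(void) { current_backend->barrier(); }

/* Example application code: number of rows owned by this process in a
 * block distribution, written only against the wrappers. */
int rows_owned(int nrows)
{
    int size = dist_size();
    int rank = dist_rank();
    return nrows / size + (rank < nrows % size ? 1 : 0);
}
```

Because `rows_owned` depends only on the wrappers, swapping the backend changes the transport without touching application code.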
-
- Jul 28, 2015
PRUVOST Florent authored
-
- Nov 19, 2014
PRUVOST Florent authored
Change copyright; correct whitespace; place CMake modules that depend on chameleon in cmake_modules and no longer in cmake_modules/morse
-
- Nov 16, 2014
PRUVOST Florent authored
-
PRUVOST Florent authored
-
PRUVOST Florent authored
-
PRUVOST Florent authored
-