1. 09 Mar, 2017 1 commit
  2. 06 Mar, 2017 1 commit
  3. 23 Dec, 2016 1 commit
  4. 09 Dec, 2016 1 commit
  5. 01 Dec, 2016 1 commit
    • Re-work the cmake for simulation mode: · 7255c7c4
      PRUVOST Florent authored
          - use the (starpu_cpu_func_t) 1 trick, same as for cuda_func
          - CPU functions are no longer defined, which avoids the dependency on coreblas
          - add #if !defined(CHAMELEON_SIMULATION) where needed
          - remove the dependency on the coreblas library (no longer needed)
          - remove the unneeded simucblas and simulapacke libraries
          - remove the CHAMELEON_SIMULATION_MAGMA cmake variable and definition
            - keep using CHAMELEON_USE_CUDA and CHAMELEON_USE_MAGMA to consider CUDA kernels
            - this avoids introducing useless new variables
          - work on messages
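
The dummy-pointer trick named in the list above can be sketched as follows. This is a minimal illustration, not Chameleon's code: `cpu_func_t`, `struct codelet`, `CODELET_CPU_FUNC`, `make_dgemm_codelet`, and `codelet_has_cpu_impl` are hypothetical stand-ins for the StarPU types and codelet definitions. Only the `CHAMELEON_SIMULATION` macro and the idea of casting 1 to a function pointer come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the runtime's types; not StarPU's definitions. */
typedef void (*cpu_func_t)(void *buffers[], void *cl_arg);

struct codelet {
    const char *name;
    cpu_func_t  cpu_func;   /* NULL would mean "no CPU implementation" */
};

#if !defined(CHAMELEON_SIMULATION)
/* Real kernel: compiled only outside simulation, so simulation builds
 * never reference the numerical library it would call. */
static void cl_dgemm_cpu_func(void *buffers[], void *cl_arg)
{
    (void)buffers;
    *(int *)cl_arg += 1;    /* placeholder for the actual computation */
}
#define CODELET_CPU_FUNC(f) (f)
#else
/* Simulation: a non-NULL dummy address. It only signals that a CPU
 * implementation exists and must never be called. */
#define CODELET_CPU_FUNC(f) ((cpu_func_t) 1)
#endif

/* Build the codelet; the same source line works in both modes because
 * the macro discards the function name when simulating. */
static struct codelet make_dgemm_codelet(void)
{
    struct codelet cl = { "dgemm", CODELET_CPU_FUNC(cl_dgemm_cpu_func) };
    return cl;
}

/* Returns 1 when the codelet advertises a CPU implementation. */
static int codelet_has_cpu_impl(const struct codelet *cl)
{
    assert(cl != NULL);
    return cl->cpu_func != NULL;
}
```

Compiling this sketch with `-DCHAMELEON_SIMULATION` drops the kernel body entirely, so nothing in the translation unit refers to a BLAS layer, while the codelet still reports that a CPU implementation exists.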
      
  6. 12 Oct, 2016 2 commits
  7. 20 Sep, 2016 2 commits
    • Add possibility to use z/cgemm3m for complex mat-mat products · 747c7935
      Guillaume Sylvand authored
      This routine, available in MKL, computes a product in 6n^3 operations instead of 8n^3,
      but is worthwhile only for "large enough" matrices (to be tested...).
      Potentially, we gain 25% in all complex computations.
      It could be interesting to look for it or implement it in CUDA.
      
      !!! Note that the flop counters are not updated                       !!!
      !!! In C/Z precision, most flop counters should be multiplied by 0.75 !!!
      
      It is OFF by default.
      It is activated with MORSE_Enable(MORSE_GEMM3M).
      In the timing routines, it is activated with --gemm3m.
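
The operation counts quoted above follow from the classic "3M" formulation of complex multiplication, which is assumed here to be what gemm3m refers to. A conventional complex product needs four real matrix products (4 × 2n^3 = 8n^3 flops), whereas 3M needs three (3 × 2n^3 = 6n^3, plus O(n^2) additions), hence the factor 0.75. The sketch below is only an illustration of that formulation with a naive real kernel; the function names `real_gemm_acc` and `gemm3m_sketch` and the split real/imaginary storage are assumptions, not the MKL or MORSE interface.

```c
#include <assert.h>
#include <stdlib.h>

/* C += A * B for real n-by-n row-major matrices: about 2*n^3 flops. */
static void real_gemm_acc(int n, const double *A, const double *B, double *C)
{
    for (int i = 0; i < n; i++)
        for (int k = 0; k < n; k++)
            for (int j = 0; j < n; j++)
                C[i * n + j] += A[i * n + k] * B[k * n + j];
}

/* (Cr + i Ci) = (Ar + i Ai) * (Br + i Bi) using three real products:
 *   T1 = Ar*Br, T2 = Ai*Bi, T3 = (Ar+Ai)*(Br+Bi)
 *   Cr = T1 - T2, Ci = T3 - T1 - T2
 * Returns 0 on success, -1 if the workspace cannot be allocated. */
static int gemm3m_sketch(int n,
                         const double *Ar, const double *Ai,
                         const double *Br, const double *Bi,
                         double *Cr, double *Ci)
{
    assert(n > 0);
    size_t nn = (size_t)n * (size_t)n;
    double *w = calloc(5 * nn, sizeof *w);   /* zero-initialised workspace */
    if (w == NULL)
        return -1;
    double *T1 = w, *T2 = w + nn, *T3 = w + 2 * nn;
    double *Sa = w + 3 * nn, *Sb = w + 4 * nn;

    for (size_t p = 0; p < nn; p++) {
        Sa[p] = Ar[p] + Ai[p];
        Sb[p] = Br[p] + Bi[p];
    }
    real_gemm_acc(n, Ar, Br, T1);
    real_gemm_acc(n, Ai, Bi, T2);
    real_gemm_acc(n, Sa, Sb, T3);

    for (size_t p = 0; p < nn; p++) {
        Cr[p] = T1[p] - T2[p];
        Ci[p] = T3[p] - T1[p] - T2[p];
    }
    free(w);
    return 0;
}
```

The subtraction in `Ci` is where 3M can lose some accuracy compared with the four-product form, which is one reason such a variant is usually left as an opt-in.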
    • Add a 'progress indicator' feature, that displays a percentage of completion · 92a3c4a1
      Guillaume Sylvand authored
      It is OFF by default.
      It is activated with MORSE_Enable(MORSE_PROGRESS).
      In the timing routines, it is activated with --progress.
      No progress is printed for tasks that take less than 10 seconds.
  8. 09 Sep, 2016 1 commit
  9. 07 Sep, 2016 1 commit
  10. 05 Oct, 2015 1 commit
  11. 29 Sep, 2015 1 commit
  12. 28 Sep, 2015 1 commit
  13. 17 Sep, 2015 1 commit
  14. 16 Sep, 2015 1 commit
    • Introduce MORSE_Distributed_start, MORSE_Distributed_stop, · 34558c7a
      THIBAULT Samuel authored
      MORSE_Distributed_size, MORSE_Distributed_rank so that applications do not
      hardcode the use of MPI.
      
      Introduce RUNTIME_distributed_rank, RUNTIME_distributed_size,
      RUNTIME_distributed_barrier, so that MORSE does not hardcode the use of MPI
      either.
      
      This allows the use of simgrid-mpi.
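
The layering described in this commit message can be pictured with a small sketch. It is an assumption-laden illustration only: the backend table, the two toy backends, and all lowercase function names are hypothetical and are not the MORSE or RUNTIME signatures. A real MPI backend would wrap `MPI_Comm_rank`, `MPI_Comm_size`, and `MPI_Barrier` behind the same table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical backend table: each communication layer fills in its own
 * functions, so callers never name a specific library. */
struct distributed_backend {
    int  (*rank)(void);
    int  (*size)(void);
    void (*barrier)(void);
};

/* Single-process backend. */
static int  local_rank(void)    { return 0; }
static int  local_size(void)    { return 1; }
static void local_barrier(void) { }

/* Toy "simulated" backend pretending to be process 2 of 4. */
static int  fake_rank(void)     { return 2; }
static int  fake_size(void)     { return 4; }
static void fake_barrier(void)  { }

static const struct distributed_backend local_backend =
    { local_rank, local_size, local_barrier };
static const struct distributed_backend fake_backend =
    { fake_rank, fake_size, fake_barrier };

static const struct distributed_backend *active_backend = &local_backend;

/* Select the backend; in a real build this choice is made once,
 * typically at configure or start-up time. */
static void runtime_select_backend(const struct distributed_backend *b)
{
    assert(b != NULL);
    active_backend = b;
}

/* Runtime layer: the only code that touches the backend. */
static int  runtime_distributed_rank(void)    { return active_backend->rank(); }
static int  runtime_distributed_size(void)    { return active_backend->size(); }
static void runtime_distributed_barrier(void) { active_backend->barrier(); }

/* Library layer: what an application would call instead of MPI. */
static int lib_distributed_rank(void) { return runtime_distributed_rank(); }
static int lib_distributed_size(void) { return runtime_distributed_size(); }
static void lib_distributed_sync(void) { runtime_distributed_barrier(); }
```

Because the application only sees the library layer, swapping the real communication library for a simulated one requires no change to application code.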
  15. 28 Jul, 2015 1 commit
  16. 05 Feb, 2015 2 commits
  17. 19 Nov, 2014 1 commit
  18. 16 Nov, 2014 6 commits