1. 23 Dec, 2016 1 commit
  2. 09 Dec, 2016 1 commit
  3. 12 Oct, 2016 2 commits
  4. 20 Sep, 2016 2 commits
      Add possibility to use z/cgemm3m for complex mat-mat products · 747c7935
      Guillaume Sylvand authored
      This routine, available in MKL, computes the product in about 6n^3 real
      operations instead of 8n^3, but it only pays off for "large enough"
      matrices (the threshold is still to be tested).
      Potentially, we gain up to 25% in all complex computations.
      It could be worth looking for an equivalent in CUDA, or implementing one.
      
      !!! Note that the flop counters are not updated                        !!!
      !!! In C/Z precision, most flop counters should be multiplied by 0.75 !!!
      
      It is OFF by default
      It is activated with MORSE_Enable(MORSE_GEMM3M)
      In the timing routines, it is activated with --gemm3m
      Add a 'progress indicator' feature, that displays a percentage of completion · 92a3c4a1
      Guillaume Sylvand authored
      It is OFF by default
      It is activated with MORSE_Enable(MORSE_PROGRESS)
      In the timing routines, it is activated with --progress
      No progress is printed for tasks that take less than 10 seconds
  5. 16 Sep, 2015 1 commit
      Introduce MORSE_Distributed_start, MORSE_Distributed_stop, · 34558c7a
      THIBAULT Samuel authored
      Introduce MORSE_Distributed_start, MORSE_Distributed_stop,
      MORSE_Distributed_size, MORSE_Distributed_rank so that applications do not
      hardcode the use of MPI.
      
      Introduce RUNTIME_distributed_rank, RUNTIME_distributed_size,
      RUNTIME_distributed_barrier, so that MORSE does not hardcode the use of MPI
      either.
      
      This allows the use of simgrid-mpi.
  6. 28 Jul, 2015 1 commit
  7. 19 Nov, 2014 1 commit
  8. 16 Nov, 2014 4 commits
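
The 6n^3 versus 8n^3 figure quoted in commit 747c7935 can be illustrated with a small sketch of the "3M" idea: a complex product is assembled from three real matrix products instead of four. This is only an illustration in pure Python, not the MKL or MORSE implementation. The function names (`gemm3m`, `gemm_reference`, `flops_3m`, `flops_conventional`) are hypothetical, and the flop counts are leading-order estimates that ignore the O(n^2) matrix additions.

```python
def matmul_real(a, b):
    """Plain real matrix product of two matrices given as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[p] * b[p][j] for p in range(inner)) for j in range(cols)]
            for row in a]


def add(a, b):
    """Element-wise sum of two real matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def gemm3m(a, b):
    """Complex product A*B built from three real matrix products.

    With A = Ar + i*Ai and B = Br + i*Bi:
        T1 = Ar*Br, T2 = Ai*Bi, T3 = (Ar + Ai)*(Br + Bi)
        real part = T1 - T2, imaginary part = T3 - T1 - T2
    """
    ar = [[z.real for z in row] for row in a]
    ai = [[z.imag for z in row] for row in a]
    br = [[z.real for z in row] for row in b]
    bi = [[z.imag for z in row] for row in b]
    t1 = matmul_real(ar, br)
    t2 = matmul_real(ai, bi)
    t3 = matmul_real(add(ar, ai), add(br, bi))
    return [[complex(t1[i][j] - t2[i][j], t3[i][j] - t1[i][j] - t2[i][j])
             for j in range(len(t1[0]))] for i in range(len(t1))]


def gemm_reference(a, b):
    """Conventional complex product, used here only for comparison."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[p] * b[p][j] for p in range(inner)) for j in range(cols)]
            for row in a]


def flops_conventional(n):
    """Leading-order real flops: 4 multiplies + 4 adds per complex multiply-add."""
    return 8 * n ** 3


def flops_3m(n):
    """Leading-order real flops: three real products at 2n^3 each."""
    return 6 * n ** 3
```

For square matrices of order n, `flops_3m(n) / flops_conventional(n)` is 0.75, which matches the x0.75 correction the commit message asks for on the C/Z flop counters. The extra additions and the different rounding behaviour of the 3M form are presumably part of why the commit limits its interest to "large enough" matrices.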