Add possibility to use z/cgemm3m for complex mat-mat products (747c7935) · Commits · AGULLO Emmanuel / Chameleon

Commit 747c7935 authored 8 years ago by Guillaume Sylvand

Add possibility to use z/cgemm3m for complex mat-mat products

This routine, available in MKL, computes a complex matrix product in about
6n^3 real operations instead of 8n^3, but it only pays off for "large enough"
matrices (threshold to be tested...).
Potentially, we gain up to 25 % in all complex computations.
It could be interesting to look for it, or implement it, in CUDA.
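
For illustration only, here is a minimal sketch of the "3M" idea behind z/cgemm3m:
a complex product built from three real matrix products instead of four. It is
not the MKL implementation. The names zgemm3m_sketch and real_matmul are
hypothetical, and the sketch assumes square, row-major matrices stored as
interleaved (real, imaginary) pairs of doubles.

```c
#include <stdlib.h>

/* out = x * y for n-by-n row-major real matrices (naive triple loop). */
static void real_matmul(size_t n, const double *x, const double *y, double *out)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double s = 0.0;
            for (size_t k = 0; k < n; k++)
                s += x[i * n + k] * y[k * n + j];
            out[i * n + j] = s;
        }
    }
}

/*
 * Hypothetical sketch: c = a * b for n-by-n complex matrices stored as
 * interleaved (re, im) doubles, using 3 real products:
 *   T1 = Ar*Br, T2 = Ai*Bi, T3 = (Ar+Ai)*(Br+Bi)
 *   Cr = T1 - T2, Ci = T3 - T1 - T2
 * Returns 0 on success, -1 if the workspace cannot be allocated.
 */
int zgemm3m_sketch(size_t n, const double *a, const double *b, double *c)
{
    size_t nn = n * n;
    if (nn == 0)
        return 0;

    double *w = malloc(9 * nn * sizeof *w);
    if (w == NULL)
        return -1;

    double *ar = w,          *ai = w + nn,     *br = w + 2 * nn;
    double *bi = w + 3 * nn, *sa = w + 4 * nn, *sb = w + 5 * nn;
    double *t1 = w + 6 * nn, *t2 = w + 7 * nn, *t3 = w + 8 * nn;

    /* Split real and imaginary parts and form the two sums: O(n^2) work. */
    for (size_t i = 0; i < nn; i++) {
        ar[i] = a[2 * i];  ai[i] = a[2 * i + 1];
        br[i] = b[2 * i];  bi[i] = b[2 * i + 1];
        sa[i] = ar[i] + ai[i];
        sb[i] = br[i] + bi[i];
    }

    /* The three real products: this is where the n^3 work is spent. */
    real_matmul(n, ar, br, t1);
    real_matmul(n, ai, bi, t2);
    real_matmul(n, sa, sb, t3);

    /* Recombine into the complex result. */
    for (size_t i = 0; i < nn; i++) {
        c[2 * i]     = t1[i] - t2[i];
        c[2 * i + 1] = t3[i] - t1[i] - t2[i];
    }

    free(w);
    return 0;
}
```

The extra additions cost only O(n^2), which is why the saving shows up for large
matrices. Note that the recombination step can have different rounding behaviour
from the conventional four-product formula.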

!!! Note that the flop counters are not updated                       !!!
!!! In C/Z precision, most flop counts should be multiplied by 0.75  !!!
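
As a rough sketch of where the 0.75 factor comes from (the function names below
are hypothetical and are not the project's flop counters): a conventional
complex GEMM costs about 8*m*n*k real operations, while three real products
cost about 6*m*n*k, ignoring the lower-order additions.

```c
/* Leading-order real-flop estimate for a conventional complex GEMM. */
double zgemm_flops_estimate(double m, double n, double k)
{
    return 8.0 * m * n * k;   /* 4 real products of 2*m*n*k flops each */
}

/* Leading-order estimate for the 3M variant (O(n^2) additions ignored). */
double zgemm3m_flops_estimate(double m, double n, double k)
{
    return 6.0 * m * n * k;   /* 3 real products of 2*m*n*k flops each */
}
```

The ratio of the two estimates is 6/8 = 0.75, which matches the correction
factor mentioned in the note.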

It is OFF by default.
It is activated with MORSE_Enable(MORSE_GEMM3M)
In the timing routines, it is activated with --gemm3m

parent 92a3c4a1

Showing with 37 additions and 0 deletions
