Commit 1fdc9fe3 authored by COJEAN Terry

Explain the restriction proposition for getrf_nopiv

parent dcee2671
@@ -47,6 +47,12 @@ RunTest(int *iparam, double *dparam, morse_time_t *t_)
MORSE_zlacpy_Tile(MorseUpperLower, descA, descAC);
}
* Consider this optimization on some heterogeneous platforms and matrix sizes.
* The TRSM kernel often achieves a significantly lower performance rate than GEMM on GPU,
* while the two kernels perform similarly on CPU. For this algorithm it is therefore
* recommended to execute all TRSMs (a relatively small number of tasks) on CPU to increase GPU efficiency.
//RUNTIME_zlocality_onerestrict( MORSE_TRSM, STARPU_CPU );
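To make the idea behind the commented-out restriction concrete, here is a minimal toy model in C of a per-kernel worker-type mask. It is only a sketch: every name in it (`WORKER_CPU`, `WORKER_GPU`, `KERNEL_TRSM`, `restrict_kernel`, `can_run_on`) is hypothetical and is not part of MORSE or StarPU. It only mimics the effect described in the comment, namely that after the restriction TRSM tasks are eligible for CPU workers only, while GEMM stays eligible for both.

```c
/* Toy model only: hypothetical stand-ins, not the MORSE or StarPU definitions. */
enum worker_mask { WORKER_CPU = 1u << 0, WORKER_GPU = 1u << 1 };
enum kernel { KERNEL_GEMM, KERNEL_TRSM, KERNEL_COUNT };

/* Each kernel starts out eligible for every worker type. */
static unsigned kernel_where[KERNEL_COUNT] = {
    WORKER_CPU | WORKER_GPU, /* KERNEL_GEMM */
    WORKER_CPU | WORKER_GPU  /* KERNEL_TRSM */
};

/* Restrict a single kernel to the worker types set in mask
 * (assumed analogue of restricting one kernel's locality). */
static void restrict_kernel(enum kernel k, unsigned mask)
{
    kernel_where[k] = mask;
}

/* Return nonzero if kernel k may be scheduled on the given worker type. */
static int can_run_on(enum kernel k, unsigned worker)
{
    return (kernel_where[k] & worker) != 0;
}
```

In this model, calling `restrict_kernel(KERNEL_TRSM, WORKER_CPU)` before submitting tasks leaves the GPU workers free for GEMM, which is the effect the comment recommends for this algorithm.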