Bugfix - Load imbalance with QR/LQ algorithms
- Fix the data distribution of the D matrix, which is used by the GPU kernels and by StarPU to break the anti-dependency between the upper and lower parts of the diagonal tiles in the QR/LQ algorithms. This D matrix was stored only on process 0, which created memory and computation imbalance.
- Fix the workspace sizes with StarPU, which were 4 times larger than expected.
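To illustrate the imbalance caused by the D matrix placement, here is a minimal sketch that counts how many diagonal tiles each process owns when everything sits on process 0 versus when the tiles follow a 2D block-cyclic mapping. The helper names `owner_2dbc` and `count_diag_tiles` are hypothetical, and the 2D block-cyclic mapping is an assumption made for illustration; the distribution actually used in this merge request is not shown here.

```c
/* Hypothetical helper (not a Chameleon API): rank owning tile (m, n)
 * on a P x Q process grid under a 2D block-cyclic distribution,
 * with row-major rank numbering. */
static int owner_2dbc(int m, int n, int P, int Q)
{
    return (m % P) * Q + (n % Q);
}

/* Count how many of the nt diagonal tiles (k, k) each rank owns.
 * counts must have room for P * Q entries.
 * If all_on_rank0 is nonzero, every diagonal tile is assigned to
 * rank 0, which mimics the imbalance described above. */
static void count_diag_tiles(int nt, int P, int Q, int all_on_rank0,
                             int *counts)
{
    for (int r = 0; r < P * Q; r++) {
        counts[r] = 0;
    }
    for (int k = 0; k < nt; k++) {
        int rank = all_on_rank0 ? 0 : owner_2dbc(k, k, P, Q);
        counts[rank]++;
    }
}
```

With 12 diagonal tiles on a 2 x 3 grid, the all-on-rank-0 placement gives rank 0 all 12 tiles, while the block-cyclic mapping in this sketch gives each of the 6 ranks 2 tiles.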
changed milestone to %Chameleon 1.0.0
mentioned in commit f8e9aa57