ENH: improve the performance of the OpenCL transpose dot

The matrix multiplication in the OpenCL implementation of the transposed linear operator performs poorly compared with the regular linear operator.

This commit speeds up the transposed dot by using OpenCL vector types in the kernel, which improves performance by a factor of 2 or more. It also adds a restriction: the generator length must now be a multiple of 4. The linear operator tests were modified to accommodate this restriction.
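
The kernel itself is not shown here, so the following is only a minimal Python sketch of why 4-wide vector types lead to a length restriction. It assumes the vectorized dimension is the column count of a row-major matrix and that the vector width is 4 (as with OpenCL's `float4`); the function names and the choice of dimension are hypothetical and not taken from the actual code.

```python
def transposed_dot_scalar(a, x):
    """Reference version: compute y = A^T x one element at a time."""
    n_rows, n_cols = len(a), len(a[0])
    y = [0.0] * n_cols
    for j in range(n_cols):
        for i in range(n_rows):
            y[j] += a[i][j] * x[i]
    return y


def transposed_dot_vec4(a, x):
    """Compute y = A^T x in groups of 4 columns, mimicking float4 arithmetic."""
    n_rows, n_cols = len(a), len(a[0])
    # Assumption: every load covers exactly 4 consecutive values, so a length
    # that is not a multiple of 4 would leave a partial group at the end.
    if n_cols % 4 != 0:
        raise ValueError("number of columns must be a multiple of 4")
    y = [0.0] * n_cols
    for j in range(0, n_cols, 4):
        acc = [0.0, 0.0, 0.0, 0.0]      # stands in for one float4 accumulator
        for i in range(n_rows):
            row4 = a[i][j:j + 4]        # stands in for one float4 load
            for k in range(4):
                acc[k] += row4[k] * x[i]
        y[j:j + 4] = acc                # stands in for one float4 store
    return y
```

Both functions return the same result when the column count is a multiple of 4. The vectorized variant rejects other lengths instead of handling a remainder, which is the kind of trade-off the new restriction reflects.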
