Diagonal copy support
All data descriptors for the temporary copies of the diagonal, which are used to release dependencies on the lower/upper parts, should be moved to the driver level to avoid synchronization steps when possible. This is already done in the new HQR kernels but still needs to be done in:
- pzgelqf.c
- pzgelqfrh.c
- pzgeqrf.c
- pzgeqrfrh.c
- pzhetrd_he2hb.c
- pztpgqrt.c
- pzunglq.c
- pzunglqrh.c
- pzungqr.c
- pzungqrrh.c
- pzunmlq.c
- pzunmlqrh.c
- pzunmqr.c
- pzunmqrrh.c
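
The reason for moving the descriptors can be illustrated with a small, self-contained sketch. It does not use the library's real API: `diag_desc_t`, `diag_desc_create`, `diag_desc_destroy`, `sequence_wait`, `pz_step_owning`, `pz_step_borrowing` and `driver_chain` are all hypothetical names, and `sync_count` only counts how often a blocking wait would occur. The assumption is that a routine which allocates its own temporary descriptor has to wait for its tasks before freeing it, whereas a routine that receives the descriptor from the driver can return right after task submission.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a descriptor holding temporary diagonal copies. */
typedef struct {
    double *buf;
    size_t  size;
} diag_desc_t;

/* Counts the blocking synchronization points (illustration only). */
int sync_count = 0;

/* Hypothetical stand-in for waiting on all submitted tasks. */
void sequence_wait(void) { sync_count++; }

int diag_desc_create(diag_desc_t *d, size_t n)
{
    d->buf  = calloc(n, sizeof *d->buf);
    d->size = d->buf ? n : 0;
    return d->buf ? 0 : -1;
}

void diag_desc_destroy(diag_desc_t *d)
{
    free(d->buf);
    d->buf  = NULL;
    d->size = 0;
}

/* Before: the routine owns the descriptor, so it must wait for its
 * tasks to complete before it can free the workspace. */
int pz_step_owning(size_t n)
{
    diag_desc_t D;
    if (diag_desc_create(&D, n) != 0) return -1;
    /* ... tasks using D would be submitted here ... */
    sequence_wait();          /* forced synchronization */
    diag_desc_destroy(&D);
    return 0;
}

/* After: the routine only borrows the descriptor and returns
 * immediately after task submission. */
int pz_step_borrowing(diag_desc_t *D)
{
    (void)D;                  /* ... tasks using D would be submitted here ... */
    return 0;
}

/* Driver level: allocate once, call the routines, synchronize once. */
int driver_chain(size_t n, int steps)
{
    diag_desc_t D;
    if (diag_desc_create(&D, n) != 0) return -1;
    for (int i = 0; i < steps; i++) {
        if (pz_step_borrowing(&D) != 0) {
            sequence_wait();  /* drain tasks already submitted before freeing D */
            diag_desc_destroy(&D);
            return -1;
        }
    }
    sequence_wait();          /* single synchronization for the whole chain */
    diag_desc_destroy(&D);
    return 0;
}
```

In this sketch, calling `pz_step_owning` twice triggers two waits, while `driver_chain` with two steps triggers only one. An asynchronous driver could go further and leave both the wait and the deallocation to its caller.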