Missing CUDA kernels, and fix many warnings
- Add missing herfb and tpmqt CUDA kernels
- Silence all warnings in Debug with gcc 5.4
- Replace max/min macros with static inline functions to avoid warnings about comparing signed and unsigned ints and/or ints of different sizes.
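For readers unfamiliar with the min/max change, here is a minimal sketch of the idea in C. The names `example_max`, `example_min` and `example_tile_rows` are hypothetical and chosen for illustration only; the helpers actually introduced by this merge request may be named and typed differently.

```c
#include <assert.h>

/*
 * A macro such as
 *     #define max(a, b) (((a) > (b)) ? (a) : (b))
 * compares whatever types it is handed, so mixing int and unsigned int
 * operands can trigger -Wsign-compare, and each argument is evaluated twice.
 *
 * A static inline function fixes the parameter types: arguments are
 * converted once at the call site and the comparison is int vs int.
 */
static inline int example_max(int a, int b) { return (a > b) ? a : b; }
static inline int example_min(int a, int b) { return (a < b) ? a : b; }

/*
 * Hypothetical usage: number of rows in tile i when m rows are split
 * into tiles of nb rows. The last tile may be partial, and tiles past
 * the end of the matrix have zero rows.
 */
static inline int example_tile_rows(int m, int nb, int i)
{
    assert(nb > 0);
    return example_max(0, example_min(nb, m - i * nb));
}
```

With `m = 10` and `nb = 4`, tiles 0 and 1 hold 4 rows each and tile 2 holds the remaining 2. Because the parameters are plain `int`, a caller holding an unsigned value has to convert it explicitly, which makes the signedness decision visible at the call site.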
@sylvand: can you tell me if you still have trouble when compiling with StarPU? I hope I have removed every problem.
Activity
mentioned in merge request !4 (merged)
added 10 commits
- dced9fa4...5639e1b8 - 5 commits from branch solverstack:master
- a8954d6f - Add missing cuda kernels
- 9e668381 - Cleanup warnings, especially by using a static inline function instead of a macro for min/max
- 18ff643a - Fix remaining min/max
- 6a7a74cc - Fix remaining min/max
- 4b52f1eb - Apply the max/min change to other runtimes and timings
@all, I updated the pull request with Guillaume's changes.
@sylvand The names come from Lapack:
- TP -> Triangular on top of Pentagonal; this merges the TS and TT kernels (on top of a Square, and on top of a Triangle)
- qrt -> QR factorization with Tile algorithms
- mqrt -> Multiply by QR in Tile algorithm; the T has been added in Lapack to all kernels of the Tile QR. Before, these were tsmqr and ttmqr.
Normally, unmqr should be renamed gemqrt to follow Lapack.
I added tpgqrt for Generate Q. This is a specific kernel that performs the partial Q generation required by the QDWH algorithm developed by Dalal and Hatem.
added Feature label
assigned to @faverge
@all No remarks ? If nobody says anything before tomorrow, I'll merge it into the trunk.
mentioned in commit 0f9d6645