Introduce half-precision conversion and gemm kernels for GPUs

changed milestone to %Chameleon 1.3.0

added API Feature labels

requested review from @fpruvost

assigned to @faverge

approved this merge request

added 1 commit

marked this merge request as draft

added 9 commits

9f547b58...18143a45 - 3 commits from branch solverstack:master
7cfeb709 - Update cmake
e15cd89c - constants: Split the flttype field as a bitmask of datatype, datasize and mixed
0c056974 - gpucublas: Add zlag2c/dlag2h kernels to convert data precision
5e471daa - gpus: Add the hgemm kernel for cuda and hip
15d1e7e4 - gpu/cublas: add the gemmex kernel
f8542c1f - fembem: update the fembem testing to fix the enum floating type

marked this merge request as ready

added 5 commits

192a8fd9 - constants: Split the flttype field as a bitmask of datatype, datasize and mixed
0e29d4a4 - Add some protection to detect the cuda version
0e20c84c - gpus: Add the hgemm kernel for cuda and hip
3a380689 - gpu/cublas: add the gemmex kernel
5efd80ed - fembem: update the fembem testing to fix the enum floating type

added 2 commits

added 4 commits

69df2508 - constants: Split the flttype field as a bitmask of datatype, datasize and mixed
90624fa5 - gpucublas: Add dlag2h and zlag2c cuda kernels originated from Magma
8c9adb62 - gpus: Add the hgemm kernel for cuda and hip
ab98af76 - gpu/cublas: add the gemmex kernel

@x-YHong I don't know if you have seen it, but please try to build it, at least to check that it compiles on your side. Thanks.

added 6 commits

7227e811 - constants: Split the flttype field as a bitmask of datatype, datasize and mixed
8c49e757 - control: store the ncuda field in the options to know at insertion time if...
bb919ef4 - descriptor: protect datatype from mixed flag
3931bfbd - gpucublas: Add dlag2h and zlag2c cuda kernels originated from Magma
08097fd4 - gpus: Add the hgemm kernel for cuda and hip
770ecde3 - gpus/cublas: add the gemmex kernel

resolved all threads

marked this merge request as draft

added 6 commits

eb63946c - constants: Split the flttype field as a bitmask of datatype, datasize and mixed
2d6c4633 - control: store the ncuda field in the options to know at insertion time if...
d20ac9af - descriptor: protect datatype from mixed flag
b3c031f0 - gpucublas: Add dlag2h and zlag2c cuda kernels originated from Magma
108dedc0 - gpus: Add the hgemm kernel for cuda and hip
1a19cf91 - gpus/cublas: add the gemmex kernel

added 15 commits

1a19cf91...aed911e0 - 9 commits from branch solverstack:master
5f2e67a0 - constants: Split the flttype field as a bitmask of datatype, datasize and mixed
b1a2bdf5 - control: store the ncuda field in the options to know at insertion time if...
6967da15 - descriptor: protect datatype from mixed flag
d3dae207 - gpucublas: Add dlag2h and zlag2c cuda kernels originated from Magma
8f59588e - gpus: Add the hgemm kernel for cuda and hip
7acc44e7 - gpus/cublas: add the gemmex kernel

marked this merge request as ready

added 7 commits

3e958439 - 1 commit from branch solverstack:master
af2d2fff - constants: Split the flttype field as a bitmask of datatype, datasize and mixed
e97d6aa9 - control: store the ncuda field in the options to know at insertion time if...
e2eabadf - descriptor: protect datatype from mixed flag
2b1c0111 - gpucublas: Add dlag2h and zlag2c cuda kernels originated from Magma
e3e328c2 - gpus: Add the hgemm kernel for cuda and hip
9b63a0bd - gpus/cublas: add the gemmex kernel
