Chameleon issues
https://gitlab.inria.fr/solverstack/chameleon/-/issues
2017-07-04T11:22:53+02:00

Diagonal copy support
https://gitlab.inria.fr/solverstack/chameleon/-/issues/33
2017-07-04T11:22:53+02:00
Mathieu Faverge

All data descriptors for the temporary copies of the diagonal, which release dependencies on the lower/upper parts, should be moved to the driver level to avoid synchronization steps when possible. This is already done in the new HQR kernels but should also be done in:
* [x] pzgelqf.c
* [x] pzgelqfrh.c
* [x] pzgeqrf.c
* [x] pzgeqrfrh.c
* [x] pzhetrd_he2hb.c
* [x] pztpgqrt.c
* [x] pzunglq.c
* [x] pzunglqrh.c
* [x] pzungqr.c
* [x] pzungqrrh.c
* [x] pzunmlq.c
* [x] pzunmlqrh.c
* [x] pzunmqr.c
* [x] pzunmqrrh.c

Chameleon 1.0.0
BOUCHERIE Raphael

Default number of threads is 1 in new testing
https://gitlab.inria.fr/solverstack/chameleon/-/issues/87
2020-01-10T16:03:52+01:00
Philippe SWARTVAGHER

When running `mpirun -n 2 -nodelist jack0,jack1 -DSTARPU_FXT_TRACE=1 -DSTARPU_FXT_PREFIX=$(pwd)/ ~/chameleon/build/new-testing/snew-testing -o potrf -H`, I get the following output:
```
# jack0: WARNING- InfinibandVerbs: device = mlx4_0; port 2 is not active.
# jack1: WARNING- InfinibandVerbs: device = mlx4_0; port 2 is not active.
[starpu][starpu_initialize] Warning: StarPU was configured with --with-fxt, which slows down a bit, limits scalability and makes worker initialization sequential
[starpu][starpu_initialize] Warning: StarPU was configured with --with-fxt, which slows down a bit, limits scalability and makes worker initialization sequential
# pioman: WARNING- Ignoring call to piom_ltask_set_bound_thread_indexes as PIOM_DEDICATED_WAIT=0.
Id Function threads gpus P Q nb uplo n lda seedA time gflops
# pioman: WARNING- Ignoring call to piom_ltask_set_bound_thread_indexes as PIOM_DEDICATED_WAIT=0.
0 spotrf 1 0 1 2 320 Upper 1000 1000 1804289383 1.952684e-02 1.709614e+01
Connection to jack1 closed.
```
The default number of threads used is 1. In the previous timing binaries, the default was the number of workers allowed by StarPU. If I specify `-t [x]`, the given number of threads is used as expected.
Is this a bug?

Chameleon 1.0.0
Mathieu Faverge
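
The `threads` column in the output above is easy to miss because warning lines are interleaved with the header and the result row. Here is a minimal sketch, in Python, of how to pair the header with each result row so that the thread count can be checked from a log. The helper `parse_testing_rows` is hypothetical and not part of Chameleon, and the column layout is assumed from the single log shown in this issue.

```python
def parse_testing_rows(log_text):
    """Pair the column header with each result row, skipping warning lines."""
    header = None
    rows = []
    for line in log_text.splitlines():
        fields = line.split()
        # Skip blank lines and runtime warnings ("# ..." and "[starpu]..." lines).
        if not fields or fields[0].startswith(("#", "[")):
            continue
        if fields[:2] == ["Id", "Function"]:
            header = fields
        elif header is not None and len(fields) == len(header) and fields[0].isdigit():
            # A result row has as many fields as the header and starts with its Id.
            rows.append(dict(zip(header, fields)))
    return rows


# Excerpt of the log quoted in the issue.
LOG = """\
# pioman: WARNING- Ignoring call to piom_ltask_set_bound_thread_indexes as PIOM_DEDICATED_WAIT=0.
Id Function threads gpus P Q nb uplo n lda seedA time gflops
# pioman: WARNING- Ignoring call to piom_ltask_set_bound_thread_indexes as PIOM_DEDICATED_WAIT=0.
0 spotrf 1 0 1 2 320 Upper 1000 1000 1804289383 1.952684e-02 1.709614e+01
Connection to jack1 closed.
"""

rows = parse_testing_rows(LOG)
for row in rows:
    print(row["Function"], "threads =", row["threads"])
```

On this excerpt the sketch reports `spotrf threads = 1`, which matches the behaviour described in the issue.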