Add options `--forcegpu` and `--profile`
This PR:
- Restore the `--forcegpu` option to force kernels to run on the GPU whenever a CUDA implementation is available.
- Restore the `--profile` option to display the performance profile of each kernel and the distribution of kernels among the resources.
- Add a CMake option `CHAMELEON_SIMULATION_EXTENDED` to enable non-GPU kernels to be easily simulated on the GPU through SimGrid.
Activity
changed milestone to %Chameleon 1.1.0
added 1 commit
- 53172b4a - Fix compilation with parsec and openmp for forcegpu and profile options
@all Ready for review. This PR may be of interest to everyone using SimGrid, as it makes it easy to simulate any kernel on the GPU by applying to its codelet changes similar to those applied to POTRF in this PR:
    -CODELETS_CPU(zpotrf, cl_zpotrf_cpu_func)
    +#if defined(CHAMELEON_SIMULATION) && defined(CHAMELEON_SIMULATION_EXTENDED)
    +CODELETS( zpotrf, cl_zpotrf_cpu_func, cl_zpotrf_cuda_func, STARPU_CUDA_ASYNC )
    +#else
    +CODELETS_CPU( zpotrf, cl_zpotrf_cpu_func )
    +#endif
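
For readers unfamiliar with the pattern, here is a minimal standalone sketch of the same compile-time switch. It does not use the real Chameleon or StarPU macros: `toy_codelet`, `TOY_CODELETS`, `TOY_CODELETS_CPU`, `toy_codelet_can_run_on_gpu` and the kernel name `zmykernel` are all hypothetical names invented for illustration, and the two flags are defined by hand to mimic a build with simulation and `CHAMELEON_SIMULATION_EXTENDED` enabled.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumption: both flags are defined by hand here to mimic a build
 * configured with simulation and the extended simulation option. */
#define CHAMELEON_SIMULATION
#define CHAMELEON_SIMULATION_EXTENDED

typedef void (*kernel_fn)(void);

/* Hypothetical, heavily simplified stand-in for a runtime codelet. */
typedef struct {
    const char *name;
    kernel_fn   cpu_func;
    kernel_fn   cuda_func; /* NULL when no GPU implementation is registered */
} toy_codelet;

/* Toy counterparts of CODELETS / CODELETS_CPU (not the real macros). */
#define TOY_CODELETS(name, cpu, cuda) \
    static const toy_codelet cl_##name = { #name, cpu, cuda }
#define TOY_CODELETS_CPU(name, cpu) \
    static const toy_codelet cl_##name = { #name, cpu, NULL }

static void cl_zmykernel_cpu_func(void)
{
    puts("zmykernel: CPU implementation");
}

#if defined(CHAMELEON_SIMULATION) && defined(CHAMELEON_SIMULATION_EXTENDED)
/* Only declared in extended simulation builds, as in the diff above. */
static void cl_zmykernel_cuda_func(void)
{
    puts("zmykernel: simulated GPU implementation");
}
TOY_CODELETS(zmykernel, cl_zmykernel_cpu_func, cl_zmykernel_cuda_func);
#else
TOY_CODELETS_CPU(zmykernel, cl_zmykernel_cpu_func);
#endif

/* Returns 1 when the codelet exposes a GPU entry point, 0 otherwise. */
static int toy_codelet_can_run_on_gpu(const toy_codelet *cl)
{
    return cl->cuda_func != NULL;
}
```

Calling `toy_codelet_can_run_on_gpu(&cl_zmykernel)` from a `main` returns 1 with both flags defined and 0 if either `#define` is removed, which is the behavior the conditional in the diff is meant to select at build time.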
- Resolved by AGULLO Emmanuel
Thanks very much, @faverge. This is a great feature and, as always, extremely well integrated. I am just not sure what should happen if GPUs are forced but no GPU is present. It may be perfectly fine as it is.
added 11 commits
- 53172b4a...338fc6be - 7 commits from branch solverstack:master
- a9fe9f7a - Add simulation extended option to allow non GPU kernels to be simulated on the GPU
- 92ec8c6b - Separate kernel profile from trace, and add profile option to display information
- 3bad747f - Add the forcegpu option to enforce possible kernels on the GPU
- b8a1a8ef - Fix compilation with parsec and openmp for forcegpu and profile options
added 5 commits
- 17b86bc4 - 1 commit from branch solverstack:master
- 3450f821 - Add simulation extended option to allow non GPU kernels to be simulated on the GPU
- 2bf49014 - Separate kernel profile from trace, and add profile option to display information
- 15d3d3c7 - Add the forcegpu option to enforce possible kernels on the GPU
- 3943fb90 - Fix compilation with parsec and openmp for forcegpu and profile options
enabled an automatic merge when the pipeline for 3943fb90 succeeds
mentioned in commit 775728f3