Chameleon issues
https://gitlab.inria.fr/solverstack/chameleon/-/issues
Updated: 2023-07-03

Issue #72: Would need matrix name for tracing, benchmarking, etc. tools
https://gitlab.inria.fr/solverstack/chameleon/-/issues/72
Samuel THIBAULT <samuel.thibault@inria.fr>, 2023-07-03

Hello,

When using tracing, benchmarking, etc. tools with StarPU, we have the tile coordinates thanks to the call to starpu_data_set_coordinates. We would, however, also need to call starpu_data_set_name to provide a name for the matrix. For instance, time_zgemm_tile uses 3 matrices (A, B, C), and if tracing tools only report the tile coordinates, one cannot tell whether a tile belongs to A, B, or C.

AIUI, that would require adding a string parameter to PASTE_CODE_ALLOCATE_MATRIX_TILE (or perhaps automatically stringifying the descA parameter?), passing it as a new parameter to CHAMELEON_Desc_Create_* and on to chameleon_desc_init. Or would you prefer to introduce another function that PASTE_CODE_ALLOCATE_MATRIX_TILE would call after Desc_Create?
Samuel

Milestone: Chameleon 1.3.0

Issue #39: Adding gflop for each task
https://gitlab.inria.fr/solverstack/chameleon/-/issues/39
Samuel THIBAULT <samuel.thibault@inria.fr>, 2023-05-30

It'd be useful to add the amount of GFlop for each StarPU codelet, this way:

    starpu_task_insert(..., STARPU_FLOPS, MULS(nb) + ADDS(nb), ...);
Milestone: Chameleon 1.3.0

Issue #114: Worker parallel do not work with lws
https://gitlab.inria.fr/solverstack/chameleon/-/issues/114
Mathieu FAVERGE, 2023-01-12

Be careful in !206.
For now, the default scheduler when no GPUs are involved is lws, which does not work with the multi-threaded kernels.