Optimizing GPU Task Distribution: Reorganizing Streams and Improving PHEFT Weighting
1. Reorganizing the Order of Streams
To balance the distribution of tasks between streams on different GPUs, the order of streams in the list of workers has been changed: streams are now interleaved across GPUs instead of being grouped by GPU.
- Example:
  - Configuration: 2 GPUs with 4 streams each.
  - Stream IDs for each GPU:
    - GPU_1: 0, 1, 2, 3
    - GPU_2: 4, 5, 6, 7
- Worker order:
  - Old: 0, 1, 2, 3, 4, 5, 6, 7
  - New: 0, 4, 1, 5, 2, 6, 3, 7
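The new order in the example is a round-robin interleaving of the per-GPU stream lists. A minimal sketch in C, assuming each GPU has the same number of streams and that stream IDs are numbered contiguously per GPU as in the example. The function name `interleave_streams` and its parameters are hypothetical and are not taken from the actual scheduler code:

```c
#include <assert.h>

/* Hypothetical helper (not from the scheduler source): fills `order`
 * with stream IDs interleaved across GPUs in round-robin fashion.
 * Assumption: GPU g owns the contiguous IDs
 * g * streams_per_gpu .. (g + 1) * streams_per_gpu - 1.
 * `order` must have room for ngpus * streams_per_gpu entries. */
static void interleave_streams(int ngpus, int streams_per_gpu, int *order)
{
    assert(ngpus > 0 && streams_per_gpu > 0 && order != 0);

    int pos = 0;
    /* Take the s-th stream of every GPU before moving to stream s + 1. */
    for (int s = 0; s < streams_per_gpu; s++)
        for (int g = 0; g < ngpus; g++)
            order[pos++] = g * streams_per_gpu + s;
}
```

Calling `interleave_streams(2, 4, order)` fills `order` with 0, 4, 1, 5, 2, 6, 3, 7, which matches the new worker order listed above.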
2. New Weighting for PHEFT Multi-Streaming
To improve task management when several streams are in use, a new weighting has been introduced into the PHEFT algorithm. It accounts for the number of streams already busy on a GPU when ordering tasks.
- Weighting formula, applied only when busy_stream_count != 0: task_starting_time + (local_task_length[worker_ctx][nimpl] * 1.8 * busy_stream_count)
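The weighting can be sketched as a small C helper. This is an illustration only: the function name `weighted_start_time`, the constant name `BUSY_STREAM_FACTOR`, and the flat `task_length` parameter (standing in for `local_task_length[worker_ctx][nimpl]`) are hypothetical. The text only specifies the case where `busy_stream_count` is nonzero, so returning the start time unchanged when no stream is busy is an assumption:

```c
#include <assert.h>

/* Factor 1.8 comes from the formula in the text; the name is hypothetical. */
#define BUSY_STREAM_FACTOR 1.8

/* Hypothetical helper (not from the scheduler source).
 * task_length stands in for local_task_length[worker_ctx][nimpl]. */
static double weighted_start_time(double task_starting_time,
                                  double task_length,
                                  unsigned busy_stream_count)
{
    assert(task_length >= 0.0);

    if (busy_stream_count != 0)
        return task_starting_time
             + task_length * BUSY_STREAM_FACTOR * busy_stream_count;

    /* Assumption: with no busy stream, no penalty is added. */
    return task_starting_time;
}
```

Under these assumptions the penalty grows linearly with the number of busy streams, so a GPU whose streams are mostly occupied looks less attractive than one with idle streams for the same task length.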