Binding warning on non-exclusive Slurm allocations
With Slurm, launching 2 MPI processes on 2 nodes (one per node), where each node has 2 sockets and the reservation is non-exclusive, produces a warning about a failed binding, and the execution is extremely slow:
salloc -N 2 -C bora -t 5:00
mpirun nm_bench_sendrecv
# bora001.plafrim.cluster: WARNING- pioman: piom_ltask_pthread_init hwloc_set_thread_cpubind failed; rc = -1; errno = 22 (Invalid argument).
# bora002.plafrim.cluster: WARNING- pioman: piom_ltask_pthread_init hwloc_set_thread_cpubind failed; rc = -1; errno = 22 (Invalid argument).
My first investigation shows that the failure comes from binding the progression thread to the second package (idle worker #1 on Package #1 (granularity = 5 usec.)). This binding is not possible, since the Slurm allocation restricts the process to only the first core of the first socket.
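To confirm this diagnosis on a node, one option is to compare the CPUs the process is allowed to run on with the CPUs of the package that the progression thread targets. The sketch below is not pioman code: it is a Linux-only illustration that assumes the restriction is visible through the process affinity mask and that package membership can be read from sysfs. The helper names `usable_cpus`, `package_cpus` and `report` are hypothetical. An empty intersection would be consistent with the `errno = 22` (EINVAL) reported above, since on Linux setting an affinity mask that contains no permitted CPU fails with EINVAL.

```python
import os

SYSFS_CPU = "/sys/devices/system/cpu"  # assumption: standard Linux sysfs layout


def usable_cpus(target, allowed):
    """Return the CPUs of `target` that the process is allowed to bind to."""
    return set(target) & set(allowed)


def package_cpus(package_id):
    """Hypothetical helper: collect the CPUs that belong to one package."""
    cpus = set()
    for name in os.listdir(SYSFS_CPU):
        if not (name.startswith("cpu") and name[3:].isdigit()):
            continue
        path = os.path.join(SYSFS_CPU, name, "topology", "physical_package_id")
        try:
            with open(path) as f:
                if int(f.read()) == package_id:
                    cpus.add(int(name[3:]))
        except (OSError, ValueError):
            pass  # offline CPU or no topology information exposed
    return cpus


def report(package_id=1):
    """Print whether a thread could be bound to `package_id` from here."""
    if not hasattr(os, "sched_getaffinity"):
        print("os.sched_getaffinity is not available on this platform")
        return
    allowed = os.sched_getaffinity(0)  # CPUs permitted for this process
    target = package_cpus(package_id)
    ok = usable_cpus(target, allowed)
    print("allowed CPUs:       ", sorted(allowed))
    print("package", package_id, "CPUs:", sorted(target))
    if ok:
        print("binding possible on:", sorted(ok))
    else:
        print("no allowed CPU on this package; binding is expected to fail")


if __name__ == "__main__":
    report()
```

Running the script inside the same allocation, in place of `nm_bench_sendrecv`, would show on each node whether any CPU of Package #1 is part of the allowed set.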