connect_to_launcher() MPI Barrier issues when reader ranks > 2
Bug: MPI Barrier() fails when connecting to the launcher while initializing connections. This started after I began using a Guix profile. It may require an OpenMPI update.
2024-05-24 16:39:49,156:melissa.server.main:ERROR Server failed with msg MPI_ERR_OTHER: known error not in list.
Traceback (most recent call last):
File "/home/apurandare/MELISSA/melissa/server/main.py", line 100, in main
myserver.initialize_connections()
File "/home/apurandare/MELISSA/melissa/server/base_server.py", line 196, in initialize_connections
self.connect_to_launcher()
File "/home/apurandare/MELISSA/melissa/server/base_server.py", line 236, in connect_to_launcher
self.comm.Barrier()
File "mpi4py/MPI/Comm.pyx", line 675, in mpi4py.MPI.Comm.Barrier
mpi4py.MPI.Exception: MPI_ERR_OTHER: known error not in list
The issue comes from OpenMPI 4.1.6, but it only occurs on the InfiniBand node.
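To check whether the failure depends on Melissa at all, a standalone barrier test along these lines might help. This is only a minimal sketch: `checked_barrier` is a hypothetical helper, not part of Melissa, and it assumes mpi4py is built against the same OpenMPI as the one in the Guix profile.

```python
import socket


def checked_barrier(comm, label="barrier"):
    """Run comm.Barrier(); return None on success or a diagnostic string on failure.

    `comm` is any object with Get_rank() and Barrier(), e.g. mpi4py's
    MPI.COMM_WORLD. This helper is hypothetical and not part of Melissa.
    """
    rank = comm.Get_rank()
    host = socket.gethostname()
    try:
        comm.Barrier()
    except Exception as exc:  # mpi4py raises MPI.Exception on MPI errors
        return f"[rank {rank} on {host}] {label} failed: {exc}"
    return None


if __name__ == "__main__":
    try:
        from mpi4py import MPI
    except ImportError:
        print("mpi4py is not installed; nothing to run")
    else:
        # Print the MPI build once so it can be compared with the Guix profile.
        if MPI.COMM_WORLD.Get_rank() == 0:
            print(MPI.Get_library_version())
        message = checked_barrier(MPI.COMM_WORLD)
        print(message or f"rank {MPI.COMM_WORLD.Get_rank()}: barrier OK")
```

Saved under a hypothetical name such as `barrier_check.py`, it could be launched with `mpirun -n 3 python barrier_check.py` on the InfiniBand node and on another node for comparison. If the barrier also fails here, the problem is likely in the OpenMPI setup rather than in `connect_to_launcher()`.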
Edited by PURANDARE Abhishek