Breed with MPI rank
The image showcases the divergence and synchronization points between the training ranks and the breed rank.
- Split the original communicator so that the last rank is excluded from training. (Refactoring TBD: launch with `mpirun -n 4 -- ./server.sh : -n 1 ./server.sh`, i.e. split based on `MPI_APPNUM` instead of launching an extra rank explicitly in the configuration.)
- Update the communicator to be used for the regular server activities.
- Diverge all ranks as shown below
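
The steps in the list above could be sketched roughly as follows. This is a minimal illustration using mpi4py, not the server's actual code: the names `split_color`, `split_world`, `TRAINING_COLOR` and `BREED_COLOR` are hypothetical, and the `appnum` branch assumes the MPMD launch from the refactoring note, where the second application (the breed rank) receives `MPI_APPNUM` 1.

```python
TRAINING_COLOR = 0  # hypothetical label for the training ranks
BREED_COLOR = 1     # hypothetical label for the breed rank


def split_color(rank, world_size, appnum=None):
    """Pick the color passed to Split for one rank.

    With appnum=None the last rank of the world communicator becomes the
    breed rank. With an appnum (the MPI_APPNUM attribute of an MPMD launch),
    the application number decides instead: 0 trains, anything else breeds.
    """
    if world_size < 2:
        raise ValueError("need at least one training rank and one breed rank")
    if appnum is not None:
        return TRAINING_COLOR if appnum == 0 else BREED_COLOR
    return BREED_COLOR if rank == world_size - 1 else TRAINING_COLOR


def split_world(use_appnum=False):
    """Split COMM_WORLD into a training communicator and a breed communicator."""
    from mpi4py import MPI  # imported lazily so split_color works without MPI

    world = MPI.COMM_WORLD
    rank = world.Get_rank()
    # Get_attr returns None if the launcher did not set MPI_APPNUM.
    appnum = world.Get_attr(MPI.APPNUM) if use_appnum else None
    color = split_color(rank, world.Get_size(), appnum)
    # Ranks sharing a color end up in the same sub-communicator;
    # key=rank keeps their relative order from the world communicator.
    sub_comm = world.Split(color=color, key=rank)
    return world, sub_comm, color == BREED_COLOR
```

In this sketch, each rank would call `split_world()` once at startup. The training ranks would then use the returned sub-communicator for the regular server activities, the breed rank would branch into its own loop, and the world communicator would remain available for the synchronization points between the two groups.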
Edited by PURANDARE Abhishek