Try MPI client/server communication
- Determine implementation support
  - ✅ MPICH
  - ✅ Intel MPI
  - ✅ ParaStation MPI
  - ❌ OpenMPI
- Test if mpi4py works with MPI connect
  - server-side
  - client-side
- MPI error handlers are associated with an MPI communicator. Determine which communicator is associated with the error handler invoked by...
  - MPI_Comm_accept() [the error handler of the MPI communicator passed to it]
  - MPI_Comm_connect() [the error handler of the MPI communicator passed to it]
  - Errors forced with empty port names.
- Measure throughput
  - JUWELS (most important)
  - Jean Zay
  - [ ] Grid 5000 (least important)
- Test MPI connect error handling
  - Test behavior if there is no server
    - all client processes busy-wait 👎
  - Test behavior if the server called MPI_Open_port() but not MPI_Comm_accept()
    - all client processes busy-wait 👎
  - Test behavior if server dies
    - client detects exited server processes when calling MPI_Intercomm_merge (server process terminated before MPI_Intercomm_merge call)
    - communication may or may not continue uninterrupted depending on the server processes that exit and the number of server processes that terminate prematurely (server process terminated after MPI_Intercomm_merge call)
  - Test behavior if client(s) die(s)
    - server crash (segmentation violation) with MPICH 3.3 on Debian 10 if all client processes exit after calling MPI_Intercomm_merge
    - client terminates if any process terminates (client process terminated before MPI_Intercomm_merge call)
    - server detects exited client processes when calling MPI_Intercomm_merge (client process terminated before MPI_Intercomm_merge call)
    - communication may or may not continue uninterrupted depending on the client processes that exit and the number of client processes that terminate prematurely (client process terminated after MPI_Intercomm_merge call)
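
For the mpi4py item in the checklist above, here is a minimal sketch of what a server-side and client-side test could look like. It relies on mpi4py's `MPI.Open_port`, `Comm.Accept`, `Comm.Connect` and `Intercomm.Merge`. The function names `parse_role`, `run_server` and `run_client`, and the choice to print the port name on stdout, are assumptions made for illustration. Whether this works with a given MPI implementation is exactly what the checklist item is meant to find out.

```python
"""Sketch of an MPI client/server connection test with mpi4py (run under mpiexec)."""
import sys


def parse_role(argv):
    """Return ("server", None) or ("client", port_name) from command-line arguments."""
    if len(argv) >= 2 and argv[1] == "server":
        return "server", None
    if len(argv) >= 3 and argv[1] == "client":
        return "client", argv[2]
    raise ValueError("usage: (server | client PORT_NAME)")


def run_server():
    from mpi4py import MPI  # imported lazily so this module loads without MPI

    comm = MPI.COMM_WORLD
    port_name = MPI.Open_port() if comm.Get_rank() == 0 else None
    if port_name is not None:
        # Assumption: the port name reaches the clients out of band via stdout.
        print(port_name, flush=True)
    # Share the port name so every server rank passes a valid string to Accept.
    port_name = comm.bcast(port_name, root=0)
    intercomm = comm.Accept(port_name, root=0)  # collective over the server ranks
    merged = intercomm.Merge(high=False)  # server ranks are ordered first
    total = merged.allreduce(1)  # one collective across server and client ranks
    if merged.Get_rank() == 0:
        print(f"server: merged communicator has {total} ranks", flush=True)
    merged.Free()
    intercomm.Disconnect()
    if comm.Get_rank() == 0:
        MPI.Close_port(port_name)


def run_client(port_name):
    from mpi4py import MPI  # imported lazily so this module loads without MPI

    comm = MPI.COMM_WORLD
    intercomm = comm.Connect(port_name, root=0)  # collective over the client ranks
    merged = intercomm.Merge(high=True)  # client ranks are ordered last
    merged.allreduce(1)  # matches the server's allreduce
    merged.Free()
    intercomm.Disconnect()


if __name__ == "__main__" and len(sys.argv) > 1 and sys.argv[1] in ("server", "client"):
    role, port = parse_role(sys.argv)
    if role == "server":
        run_server()
    else:
        run_client(port)
```

The server would be started first under `mpiexec` with the argument `server`, and the client afterwards with `client` followed by the printed port name. Port names may contain characters that are special to the shell, so they probably need quoting.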
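
The error handler experiment from the checklist above (errors forced with empty port names) could be reproduced from Python roughly as follows. This is a sketch under assumptions, not the code used for the original finding. It sets `MPI.ERRORS_ARE_FATAL` on `MPI.COMM_WORLD` and `MPI.ERRORS_RETURN` on a duplicate, then calls `Connect` on the duplicate. If mpi4py raises `MPI.Exception` instead of the job aborting, the handler of the communicator passed to the call was used. The helper names `describe_failure` and `probe_connect_errhandler` are hypothetical.

```python
"""Sketch: find out which communicator's error handler MPI_Comm_connect uses."""


def describe_failure(operation, error_class, message):
    """Format one line describing a failed MPI call."""
    return f"{operation} failed with error class {error_class}: {message}"


def probe_connect_errhandler():
    from mpi4py import MPI  # imported lazily so this module loads without MPI

    world = MPI.COMM_WORLD
    comm = world.Dup()  # separate communicator with its own error handler
    saved = world.Get_errhandler()
    world.Set_errhandler(MPI.ERRORS_ARE_FATAL)
    comm.Set_errhandler(MPI.ERRORS_RETURN)
    try:
        # The empty port name is meant to force an error. If the handler of
        # COMM_WORLD were used instead, the job would abort at this point.
        intercomm = comm.Connect("", root=0)
    except MPI.Exception as exc:
        report = describe_failure(
            "MPI_Comm_connect", exc.Get_error_class(), exc.Get_error_string()
        )
    else:
        intercomm.Disconnect()
        report = "MPI_Comm_connect unexpectedly succeeded"
    finally:
        world.Set_errhandler(saved)  # restore the original handler
        saved.Free()
    comm.Free()
    return report
```

Calling `print(probe_connect_errhandler())` in a script started with `mpiexec` would show the result. The same pattern with `comm.Accept("", root=0)` would cover MPI_Comm_accept(). The state of the MPI library after a returned error is implementation-dependent, so the cleanup calls may themselves fail.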
Test program:
- User passes number of clients, number of processes for each client
- Launcher starts server
- Launcher waits for server message with hostname, port (for TCP) or MPI port name
- Launcher starts clients
- Server, clients communicate
- No fault tolerance, experiment aborted on error
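
The launcher steps above can be sketched with the Python standard library. This is an illustration under assumptions, not the actual test program. The option names, the helpers `parse_args`, `mpiexec_command` and `launch`, the convention that the server prints its address (hostname and port, or MPI port name) as its first line on stdout, and the timeout are all hypothetical. The timeout is there because clients were observed to busy-wait when no server accepts. It only aborts the experiment and adds no fault tolerance.

```python
"""Sketch of a launcher: start the server, read its address, start the clients."""
import argparse
import subprocess
import sys


def parse_args(argv):
    """Parse the (hypothetical) launcher options."""
    parser = argparse.ArgumentParser(description="Start one server and several clients.")
    parser.add_argument("--num-clients", type=int, required=True)
    parser.add_argument("--procs-per-client", type=int, required=True)
    parser.add_argument("--timeout", type=float, default=60.0,
                        help="seconds to wait for each process before aborting")
    return parser.parse_args(argv)


def mpiexec_command(num_procs, program_args):
    """Build an mpiexec command line that starts num_procs processes."""
    return ["mpiexec", "-n", str(num_procs), *program_args]


def launch(server_cmd, make_client_cmd, num_clients, timeout):
    """Run one experiment and return (address, exit_codes), server first.

    make_client_cmd(index, address) must return the command line of one client.
    Any failure or timeout raises, and all remaining processes are killed.
    """
    server = subprocess.Popen(server_cmd, stdout=subprocess.PIPE, text=True)
    procs = [server]
    try:
        # Blocks until the server announces itself or closes stdout.
        address = server.stdout.readline().strip()
        if not address:
            raise RuntimeError("server exited without announcing an address")
        for index in range(num_clients):
            procs.append(subprocess.Popen(make_client_cmd(index, address)))
        codes = [proc.wait(timeout=timeout) for proc in procs]
        if any(codes):
            raise RuntimeError(f"experiment aborted, exit codes: {codes}")
        return address, codes
    finally:
        for proc in procs:
            if proc.poll() is None:
                proc.kill()  # abort everything that is still running
            proc.wait()
        server.stdout.close()
```

A real run might pass `mpiexec_command(1, [...])` as `server_cmd` and a `make_client_cmd` that wraps a client program in `mpiexec_command(args.procs_per_client, [...])`. Only the first line of the server's output is read here, so a server that prints much more to stdout could block on a full pipe.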
Edited by Christoph Conrads