Negative count for MPI message size
When generating large fields with few MPI processes, the number of elements (an int) sent during an MPI communication exceeds the largest value representable with an int, which makes the code crash. This happens at two different places in the code with N=1024**3:
- nb_proc = 2
Fatal error in PMPI_Sendrecv: Invalid count, error stack: PMPI_Sendrecv(236): MPI_Sendrecv(sbuf=0x7f2ac64f9010, scount=-12570628, MPI_DOUBLE, dest=1, stag=1, rbuf=0x7f3aba528ff0, rcount=-12570628, MPI_DOUBLE, src=1, rtag=2, comm=0x84000006, status=0x7ffc46e8b630) failedPMPI_Sendrecv(111): Negative count, value is -12570628
The error happens in fftw.
- nb_proc = 4
PMPI_Irecv(156): MPI_Irecv(buf=0x7f6ca32fd010, count=-2147483648, MPI_BYTE, src=0, tag=4, comm=0x84000000, request=0xd63bf0) failed PMPI_Irecv(98).: Negative count, value is -2147483648
The error happens in build_champ.
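For the nb_proc = 4 case, the overflow can be reproduced arithmetically: the local field holds 1024**3 / 4 = 268435456 doubles, i.e. 2^31 bytes, which wraps to -2147483648 when stored in a 32-bit int. A minimal sketch of the mechanism (variable names are illustrative, not the actual code of build_champ):

```c
#include <stdio.h>
#include <stddef.h>

int main(void)
{
    size_t n_global = 1024UL * 1024UL * 1024UL;   /* N = 1024**3 elements       */
    int    nb_proc  = 4;
    size_t n_local  = n_global / nb_proc;         /* 268435456 doubles per rank */
    size_t n_bytes  = n_local * sizeof(double);   /* 2147483648 bytes = 2^31    */

    /* MPI counts are plain C ints; on usual platforms the conversion below
     * wraps 2^31 to -2147483648, the value reported by MPI_Irecv. */
    int count = (int)n_bytes;

    printf("bytes needed: %zu\n", n_bytes);
    printf("int count   : %d\n", count);
    return 0;
}
```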
While such situations should be rare and can easily be avoided by increasing the number of processes, this is a limitation that cannot be circumvented, since custom MPI datatypes can't be used inside the fftw library.
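Outside of fftw, the usual way to keep the count below INT_MAX is the custom-datatype approach mentioned above: describe the message as a number of fixed-size blocks so that the count passed to MPI stays small. The sketch below assumes contiguous data and a length that is a multiple of the block size (true for 1024**3 / nb_proc with a power-of-two block); send_large is an illustrative name, not an existing function in the code, and as said above this cannot be applied to the Sendrecv issued internally by fftw.

```c
#include <mpi.h>
#include <stddef.h>

/* Send `n` doubles even when n exceeds INT_MAX, by packing a fixed-size
 * block into a derived datatype and sending `n / block` of those blocks.
 * Assumes the buffer is contiguous and n is a multiple of `block`. */
static void send_large(const double *buf, size_t n, int dest, int tag,
                       MPI_Comm comm)
{
    const int block = 1 << 20;                /* 2^20 doubles per block */
    MPI_Datatype chunk;

    MPI_Type_contiguous(block, MPI_DOUBLE, &chunk);
    MPI_Type_commit(&chunk);

    /* The count passed to MPI is now n / block, which fits in an int. */
    MPI_Send(buf, (int)(n / block), chunk, dest, tag, comm);

    MPI_Type_free(&chunk);
}
```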