All MPI programs need some sort of mpiexec
See e.g.
https://cado-nfs-ci.loria.fr/ci/job/future-lingen/job/mpi-plafrim-current-amd64/109/console
We have a problem when we try to test the MPI functionality of cado-nfs. Such a build produces binaries that link against a real MPI library (as opposed to the almost-semantically-correct `fakempi.h` header). As a consequence, the binaries must be run via a dedicated launcher, even in single-node contexts. Mileage varies:
- In some cases, it is fine to run the binary as is, and doing so is equivalent to `mpiexec -n 1`.
- In some cases, an `mpiexec` is mandatory (see the example above).
- In some cases, `mpiexec` does not work either, since process launch must go through the job scheduler (`srun`), using the PMI/PMIx layer that is common to the job scheduler and the MPI runtime.
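The three cases above can be sketched as a small detection helper. This is a hypothetical sketch, not existing cado-nfs code; the name `pick_launcher` and the use of `SLURM_JOB_ID`/`NPROCS` are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch: choose a launcher precommand for an MPI binary,
# covering the three cases above. Not an existing cado-nfs helper.
pick_launcher() {
    # Case 3: inside a Slurm allocation, process launch must go through
    # the scheduler so that the PMI/PMIx wiring is set up.
    if [ -n "$SLURM_JOB_ID" ] && command -v srun >/dev/null 2>&1; then
        echo "srun"
    # Case 2: a plain mpiexec is available (and may be mandatory).
    elif command -v mpiexec >/dev/null 2>&1; then
        echo "mpiexec -n ${NPROCS:-1}"
    # Case 1: no launcher at all; with some MPI implementations,
    # running the binary directly behaves like mpiexec -n 1.
    else
        echo ""
    fi
}
```

The caller would then prepend `$(pick_launcher)` to the command line; the hard part, as described below, is that several independent entry points currently each do (or fail to do) their own version of this.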
How to deal with this in cado-nfs is not entirely clear, but it must be dealt with so that we are able to run routine checks of the MPI functionality.
Presently we have a big mess. The situation below refers to the future-lingen branch (see !1 (merged))
- bwc programs, when run via `bwc.pl`, seem to always have the proper mpiexec precommand. The srun-only case, although supported in some ways, is not detected automatically.
- programs run by `make test` (should) go through `tests/do_with_mpi.sh` and `tests/guess_mpi_configs.sh`, which are partly redundant with what `bwc.pl` does.
- programs run by `cado-nfs.py` (e.g., `sm`) have no mpiexec prefix at all, and are currently run standalone.
Temporarily, I think that disabling all `full_*` tests for MPI builds is among the things to do (done in 1d81735c).
Further action might include creating a common runner for all programs.
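As a rough idea of what such a common runner could look like: a single wrapper that every entry point (`bwc.pl`, `make test`, `cado-nfs.py`) invokes instead of duplicating launcher detection. This is only a sketch under assumed conventions; the name `mpi_run` and the environment variables used are hypothetical.

```shell
#!/bin/sh
# Hypothetical common runner sketch (not an existing cado-nfs script):
# run an MPI binary through whichever launcher the environment requires.
# Usage: mpi_run <binary> [args...]
mpi_run() {
    if [ -n "$SLURM_JOB_ID" ] && command -v srun >/dev/null 2>&1; then
        # Scheduler-mediated launch (PMI/PMIx case).
        srun "$@"
    elif command -v mpiexec >/dev/null 2>&1; then
        # Standard mpiexec launch; NPROCS is an assumed knob.
        mpiexec -n "${NPROCS:-1}" "$@"
    else
        # No launcher available: run standalone.
        "$@"
    fi
}
```

Having one such entry point would at least make the srun-only clusters a matter of fixing detection in a single place, instead of in `bwc.pl`, the test scripts, and `cado-nfs.py` separately.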