Should we keep using ZMQ?
I feel that ZMQ has more drawbacks than benefits for Batsim and that we should consider not using it anymore. What do you think?
Pros of using ZMQ
- Handles language-agnostic network messages.
This is easy to implement ourselves by prefixing messages with a size encoded in a well-defined endianness.
- Handles lower-level communication protocols. This enables the transparent use of TCP or Unix sockets.
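To give an idea of what the size-prefix alternative involves, here is a minimal sketch in Python. The wire format (a 4-byte unsigned big-endian length followed by the payload) and the helper names `send_message`, `recv_exact` and `recv_message` are assumptions made for illustration; nothing here is an existing Batsim or pybatsim API.

```python
import socket
import struct

# Assumed wire format: 4-byte unsigned big-endian length, then the payload.
HEADER = struct.Struct("!I")


def send_message(sock: socket.socket, payload: bytes) -> None:
    """Send one message, prefixed by its size."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock: socket.socket, size: int) -> bytes:
    """Read exactly `size` bytes, or raise if the peer closes first."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_message(sock: socket.socket) -> bytes:
    """Receive one size-prefixed message."""
    (size,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, size)
```

The same functions work unchanged on a TCP socket or a Unix stream socket, since both expose the same stream interface. Any language that can read a fixed-width big-endian integer can implement the other end.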
Cons of using ZMQ
- Adds a dependency to Batsim and to all projects that communicate with Batsim.
This dependency is a binding for non-C/C++ projects, which is quite annoying as language-specific packages do not work properly with bindings (e.g., pybatsim is bundled with pyzmq on PyPI but without libzmq.so, which may result in dlopen-like errors for end users).
- ZMQ tries hard to handle connection loss (either from servers or clients in REQ/REP). We do not want this at all for Batsim: we'd rather stop the simulation when either Batsim or the scheduler is lost. AFAIK disabling this feature is not possible, and this has several impacts.
  - We need to make sure connections do not already exist before starting a simulation, to avoid hindering simulations that are already running. This is currently done by robin, but it lacks robustness, especially when several simulations are to be launched on the same machine.
  - We need some black magic to detect when a connection is lost. If one of the two processes fails, robin currently handles the situation by killing the other process. In case of an infinite loop, neither Batsim nor most scheduler implementations do anything, but pybatsim has a timeout mechanism that can be detrimental to the simulation (if for whatever reason SimGrid takes a very long time to simulate a simulation step, the simulation will stop).
- ZMQ uses a thread, which makes performance analysis of the whole simulation more complex. This is not so important for optimized code, as we plan to use scheduler libraries for it (thus without ZMQ), but pybatsim will remain annoying to analyze because of ZMQ.
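For comparison, here is a rough sketch of how peer loss shows up on a plain stream socket, without a timeout. It assumes both processes run on the same machine and talk over a connected stream socket (TCP or Unix). The `PeerLost` exception and the `recv_or_fail` helper are hypothetical names used only for this example.

```python
import socket


class PeerLost(Exception):
    """Raised when the other process is no longer there (hypothetical name)."""


def recv_or_fail(sock: socket.socket, bufsize: int = 4096) -> bytes:
    """Block until the peer sends data; raise PeerLost if the peer is gone.

    No timeout is used, so a very long simulation step simply keeps
    this call blocked instead of aborting the simulation.
    """
    try:
        data = sock.recv(bufsize)
    except (ConnectionResetError, BrokenPipeError) as exc:
        raise PeerLost("connection reset by peer") from exc
    if data == b"":
        # An empty read on a stream socket means the peer closed its end,
        # which also happens when the peer process exits or crashes.
        raise PeerLost("peer closed its end of the socket")
    return data
```

With this approach, either process can stop as soon as the other one terminates, because the operating system closes the sockets of a dead process. It does not help with the infinite-loop case, where the peer is still alive and its socket stays open.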