docker: how to control a big number of docker containers (related to doc/advanced_docker.py)
In the example we illustrate how we can use EnOSlib to target remote containers (instead of bare-metal hosts or VMs for instance).
This relies on the docker connection plugin, which issues docker commands through an SSH connection to the remote hosts. For instance, to run the raw command date on the remote container mydocker-0 running on paravance-8.rennes.grid5000.fr from the local machine, the following local docker command is run: docker -H ssh://root@paravance-8.rennes.grid5000.fr exec mydocker-0 "date"
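To make the shape of that invocation explicit, here is a minimal sketch that builds the same command line in Python. The helper name docker_exec_argv and its parameters are hypothetical, chosen for illustration only; this is not how the connection plugin is implemented.

```python
import shlex
from typing import List


def docker_exec_argv(host: str, container: str, command: str, user: str = "root") -> List[str]:
    """Build the local docker CLI call that runs `command` in a remote container.

    Hypothetical helper: it only assembles the argument list, it does not run it.
    """
    # `docker -H ssh://user@host` makes the local client talk to the remote daemon over SSH.
    return ["docker", "-H", f"ssh://{user}@{host}", "exec", container, *shlex.split(command)]


argv = docker_exec_argv("paravance-8.rennes.grid5000.fr", "mydocker-0", "date")
print(" ".join(argv))
```

The resulting list can be passed to subprocess.run(argv) from a machine that has SSH access to the host. Each such call opens its own SSH connection.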
However, using this plugin comes with some caveats due to how the connection to remote containers is handled:
- we can hit #147 (closed) because the SSH connections aren't multiplexed
- we can hit a limit on the remote docker daemon, which does not accept too many simultaneous connections

Both will happen when increasing the number of containers per node and/or the number of nodes.
In the past we explored different ways of working around these problems:
- having an ansible_retries parameter that retries the remote actions on failed hosts only (but this has been removed)
- to a lesser extent, having access to the remote API directly instead of tunneling through SSH (not sure this will ensure a robust execution: for the first caveat we'll still need to tunnel the TCP connection when running from outside, and the second one might also be hit).
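The "retry on failed hosts only" idea from the first workaround can be sketched as follows. This is an illustration under assumptions, not the removed ansible_retries implementation: the function retry_failed_only, its parameters, and the convention that an action returns True on success are all hypothetical.

```python
from typing import Callable, Dict, Iterable


def retry_failed_only(
    hosts: Iterable[str],
    action: Callable[[str], bool],
    retries: int = 3,
) -> Dict[str, bool]:
    """Run `action` on every host, then re-run it only on the hosts that failed.

    Hypothetical helper: `action(host)` is assumed to return True on success.
    """
    results = {host: False for host in hosts}
    # One initial attempt plus up to `retries` additional rounds.
    for _ in range(retries + 1):
        pending = [host for host, ok in results.items() if not ok]
        if not pending:
            break
        for host in pending:
            try:
                results[host] = bool(action(host))
            except Exception:
                # Treat any exception (e.g. a refused connection) as a failure to retry.
                results[host] = False
    return results
```

Hosts that succeeded are never contacted again, so each retry round opens fewer connections than the previous one.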
Fixing the above might be very beneficial for users who need to control a large number of docker containers on a testbed.