Automatic script wrapper (with systemd)
The following situation isn't uncommon:
An experimenter has a long-running program, a.out, intended to run as part of her experiment. This program must be deployed on 1 to n machines and run in the background. So far, her only option is to call the shell module (using api.run or an action block).
The caveats from the experimenter's perspective are that:
- she must make sure to correctly daemonize the process
- she must make sure to correctly record stdout/stderr of the running process (e.g. for post-mortem analysis)
- she must be able to check the program status quickly (and start/stop it)
There might also be secondary requirements about:
- the restart strategy
- throttling the program
- ...
As usual there is a full spectrum of solutions to do that:
- ansible async task / tmux / screen can be used to daemonize the process (or a.out can be coded so that it daemonizes itself)
- shell redirection can be used to capture the outputs in specific files
- docker can be used to get a uniform API to interact with the program lifecycle
Recently I came across an alternative using systemd unit files: https://gitlab.inria.fr/discovery/enoslib/-/blob/a49a485d27b081fc4988e5b340ebecaf2cf5dc7f/docs/tutorials/multi_providers/artifacts_cloud/edge_cloud.service
Systemd addresses the above caveats and is available on the targeted machines most of the time.
Note also that there are many features we can inherit from systemd (throttling, timers, output redirection, restart strategy ...). For these reasons systemd is a good candidate.
We could think of providing a default systemd unit file template that can be deployed on the remote machines transparently.
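To make the idea more concrete, here is a minimal sketch of what rendering such a default unit file could look like. The helper name render_unit and its parameters are hypothetical (nothing like this exists in EnOSlib as far as I know); the directives themselves (ExecStart=, Restart=, CPUQuota=, StandardOutput=, StandardError=) are regular systemd ones. The append: form of output redirection is, to my knowledge, only available in recent systemd versions (240 or later); without it the output ends up in the journal.

```python
from typing import Optional


def render_unit(
    exec_start: str,
    description: str = "EnOSlib managed program",
    restart: str = "on-failure",
    cpu_quota: Optional[str] = None,
    log_file: Optional[str] = None,
) -> str:
    """Return the text of a systemd service unit (hypothetical helper)."""
    service = [f"ExecStart={exec_start}", f"Restart={restart}"]
    if cpu_quota is not None:
        # Throttling: CPUQuota= caps the CPU time of the service, e.g. "50%".
        service.append(f"CPUQuota={cpu_quota}")
    if log_file is not None:
        # Output capture: append stdout/stderr to a file instead of the journal.
        service.append(f"StandardOutput=append:{log_file}")
        service.append(f"StandardError=append:{log_file}")
    lines = (
        ["[Unit]", f"Description={description}", "", "[Service]"]
        + service
        + ["", "[Install]", "WantedBy=multi-user.target", ""]
    )
    return "\n".join(lines)


# Example: always restart a.out and cap it at half a CPU.
unit_text = render_unit("/opt/my_program/a.out", restart="always", cpu_quota="50%")
print(unit_text)
```

The resulting text would then be copied to something like /etc/systemd/system/my_program.service on each remote machine (the unit name is only an example).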
I see two options/steps:
- Implement a systemd_generator as an Ansible module (maybe it already exists?).
User would be able to do:
    with en.action(roles=roles) as a:
        a.copy(src="a.out", dest="/opt/my_program/a.out")
        # not sure how to name the parameters
        a.systemd_generator(ExecStart="/opt/my_program/a.out", Restart="always", ...)
- Implement this as an EnOSlib service (which can use the above internally):
    systemd_gen = en.SystemdGenerator(machines=machines, program="/opt/my_program/a.out", ...)
    systemd_gen.deploy()
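For the service option, I would expect deploy() and its siblings to boil down to a handful of systemctl/journalctl calls on each machine once the unit file is in place. The sketch below only builds those command lines; the function name lifecycle_commands and the grouping into deploy/status/logs/destroy are assumptions on my side, and how the commands are actually run remotely (shell module, action block, ...) is left open.

```python
from typing import Dict, List


def lifecycle_commands(unit_name: str) -> Dict[str, List[str]]:
    """Map each lifecycle step to the shell commands it would run (hypothetical)."""
    return {
        # Reload unit files, then enable the unit at boot and start it right away.
        "deploy": [
            "systemctl daemon-reload",
            f"systemctl enable --now {unit_name}",
        ],
        # Quick status check: prints the unit state and sets the exit code.
        "status": [f"systemctl is-active {unit_name}"],
        # Post-mortem analysis: dump what the program wrote to the journal.
        "logs": [f"journalctl -u {unit_name} --no-pager"],
        # Stop the program and remove it from the boot sequence.
        "destroy": [f"systemctl disable --now {unit_name}"],
    }


commands = lifecycle_commands("my_program.service")
for step, cmds in commands.items():
    print(step, "->", "; ".join(cmds))
```

This covers the three caveats listed at the top: systemd daemonizes the process, the journal keeps stdout/stderr, and status/start/stop become one-liners.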