Switch to a Magnet-hosted CI runner
This MR would switch our default CI runner to the one hosted on the Magnet server, which is currently running in my own space.
Still to do:
- Move the runner to a shared location on the magnet server
- Tensorflow does not detect the GPU, probably due to a CUDA/TF version mismatch (requires Paul to push his final version)
- Remove docker in docker
- Caching
  - Configure caching on the shared runners, if possible
  - On magnet, the pip caching does not work as intended
We might want to refine our runner policy:
- This MR will switch all jobs to the magnet runner
- We could split between shared and magnet
- make cache persistent on shared runner (here) and use it to run routine jobs
- Use magnet runner for GPU and other python version tests
To set up a runner from scratch
- Create a docker container with access to GPU
# Create a volume
docker volume create gitlab-runner-config
# Create and start a container from the official gitlab runner image
docker run -d --name tester --gpus device=0 --restart always \
-v /var/run/docker.sock:/var/run/docker.sock \
-v gitlab-runner-config:/etc/gitlab-runner \
gitlab/gitlab-runner:latest
- Register the runner on gitlab
docker run --rm -v gitlab-runner-config:/etc/gitlab-runner \
gitlab/gitlab-runner register \
--non-interactive \
--url "https://gitlab.inria.fr" \
--registration-token "< YOUR REPO TOKEN >" \
--description "Magnet5 CI runner with access to GPU" \
--executor "docker" \
--tag-list "magnet,gpu" \
--docker-image python:3.8
- Change the config.toml file of the runner to give it access to GPU
Find your container ID by running docker ps -a, then open a shell inside the container:
docker exec -u 0 -it <CONTAINER_ID> /bin/bash
Once in, refresh the apt package lists and install nano:
apt-get update
apt-get install nano
Access the config file
nano /etc/gitlab-runner/config.toml
Under [runners.docker], add gpus = "all". Example final file:
[[runners]]
name = "Test magnet5 CI runner with access to GPU"
url = "https://gitlab.inria.fr"
id = 1234
token = "XXXX"
token_obtained_at = 2023-05-04T09:30:40Z
token_expires_at = 0001-01-01T00:00:00Z
executor = "docker"
[runners.cache]
MaxUploadedArchiveSize = 0
[runners.docker]
tls_verify = false
image = "python:3.8"
privileged = false
disable_entrypoint_overwrite = false
oom_kill_disable = false
gpus = "all"
disable_cache = false
volumes = ["/cache"]
shm_size = 0
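
The manual nano edit can also be scripted. The following is a minimal sketch, assuming a POSIX shell with grep, awk and mktemp available inside the runner container. enable_gpus is a hypothetical helper name, not a gitlab-runner command. It adds gpus = "all" right after every [runners.docker] header and does nothing if any gpus setting is already present, so a config file that declares several runners may still need a manual check.

```shell
# Hypothetical helper (not part of gitlab-runner): enable GPU access in a
# runner config file without opening an editor.
enable_gpus() {
    config_file="$1"

    # Do nothing if a gpus setting is already present, so reruns are safe.
    if grep -q '^[[:space:]]*gpus[[:space:]]*=' "$config_file"; then
        echo "gpus already configured in $config_file"
        return 0
    fi

    # Copy the file, adding the setting right after each [runners.docker] header.
    tmp_file="$(mktemp)" || return 1
    awk '
        { print }
        /^[ \t]*\[runners\.docker\]/ { print "    gpus = \"all\"" }
    ' "$config_file" > "$tmp_file" || { rm -f "$tmp_file"; return 1; }

    # Write back in place so the original file keeps its owner and permissions.
    cat "$tmp_file" > "$config_file"
    rm -f "$tmp_file"
}
```

Inside the container shell, after defining the function, it could be called as enable_gpus /etc/gitlab-runner/config.toml.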