Jobs on a Supercomputer with Slurm

The Plafrim supercomputer lets users request a dedicated account for running their gitlab-ci jobs. See the explanation here: https://plafrim-users.gitlabpages.inria.fr/doc/#gitlab-ci.

Runner installation on the supercomputer

Suppose we have obtained a Plafrim account "gitlab-gitlabci-gallery" dedicated to this project. First, register the runner that will be used on the supercomputer:

ssh gitlab-gitlabci-gallery@plafrim

# gitlab-runner executable is already installed on plafrim
module add tools/gitlab-runner/14.7.0

# register the runner
gitlab-runner register
# register your specific runner with the appropriate information, see https://docs.gitlab.com/runner/register/#linux
# Example:
# instance URL: https://gitlab.inria.fr/,
# registration token: GR13489413XJvSphSc7fb2N2pgt4y,
# description: devel01.plafrim.cluster,
# tags: plafrim,
# executor: shell

Enter the URL and the token found in the GitLab web interface (Settings -> CI/CD -> Runners -> Specific runners -> Set up a specific runner manually). Choose tags such as the project name, guix, plafrim, shell, etc. Select shell as the executor.

To increase the number of jobs that can run concurrently, edit the file ~/.gitlab-runner/config.toml and change the value of concurrent, e.g. concurrent = 10.
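The same edit can be scripted instead of done by hand. The following is a minimal sketch, assuming the file already contains a top-level concurrent line (gitlab-runner register normally writes one); set_concurrent is a hypothetical helper name, not part of gitlab-runner.

```shell
# Hypothetical helper: set the top-level `concurrent` key of a
# gitlab-runner config file to the given value.
# Assumes the file already has a line starting with "concurrent".
set_concurrent() {
    config_file="$1"
    value="$2"
    # Rewrite the existing setting in place; sed keeps the previous
    # version of the file next to it as <config_file>.bak.
    sed -i.bak "s/^concurrent[[:space:]]*=.*/concurrent = ${value}/" "$config_file"
}
```

For the account used here, the call would be: set_concurrent ~/.gitlab-runner/config.toml 10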

Then launch gitlab-runner in user mode so that your runner waits for new jobs triggered by GitLab:

ssh gitlab-gitlabci-gallery@plafrim
tmux
module add tools/gitlab-runner/14.7.0 tools/git/2.36.0 tools/gitlab-ci
gitlab-runner run &
# or use the available script on plafrim: gitlab-runner-keep-alive
# detach from the tmux shell: ctrl+b, d
# you can re-attach to it with: tmux attach

The runner should then appear in your GitLab project under Settings -> CI/CD -> Runners -> Available specific runners.

Source code

The gitlab-ci jobs are defined in .gitlab-ci.yml; their results are shown on the CI/CD -> Pipelines page (remember to enable the CI/CD feature in Settings -> General -> Visibility, project features, permissions).

Two jobs are defined with a parallel matrix, see this example:

  • one using salloc,
  • and another one using sbatch.

sbatch job submission is asynchronous: the command returns immediately, without waiting for the job to complete (see this discussion).
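One common way to make a gitlab-ci job wait for an sbatch submission is the --wait option of sbatch, which blocks until the Slurm job terminates and exits with the job's exit code. Whether this project uses that option is not stated here, so the following is only a sketch; submit_and_wait and its two arguments are hypothetical names.

```shell
# Hypothetical helper: submit a batch script, block until the Slurm job
# has finished, then print the job output so it shows up in the CI log.
submit_and_wait() {
    script="$1"   # path to the batch script
    log="$2"      # file where Slurm writes the job's stdout and stderr
    status=0
    # --wait: do not return until the job terminates; sbatch then exits
    # with the exit code of the job itself.
    sbatch --wait --output="$log" "$script" || status=$?
    # Print the job output whatever the outcome.
    [ -f "$log" ] && cat "$log"
    return "$status"
}
```

A CI script could then call, for example, submit_and_wait job.sh job.out, where job.sh is an assumed batch script containing the mpiexec command. The gitlab-ci job fails if the Slurm job fails.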

Both jobs do the same thing: they run a "pingpong" test from the Intel MPI benchmarks package with the command mpiexec IMB-MPI1 PingPong.

The pipeline is triggered by a schedule rule, so that it is not executed each time a branch is updated but only once a day at a fixed time. The date, time and repetition can be configured differently (see cron). The pipeline can also be launched manually by clicking the Play button in the schedule panel. Since the Slurm job queue may be busy, the job can take some time to start. Hence, we use a timeout of 24h for the gitlab-ci job, since it is triggered every 24 hours.

The kind of node to use (here a Slurm parameter, see the --constraint flag) is chosen through a CI/CD variable, arbitrarily named CONS, defined in the schedule panel (default is "bora").
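As an illustration of how such a variable can reach Slurm, here is a minimal sketch. The helper name slurm_node_options is hypothetical, and applying the "bora" default inside the script is an assumption, since in this project the default is set in the schedule panel.

```shell
# Hypothetical helper: turn a node type into the matching Slurm option.
# Falls back to "bora" when the argument is empty or missing.
slurm_node_options() {
    printf '%s\n' "--constraint=${1:-bora}"
}

# Example of use inside a CI job (commented out, needs a Slurm cluster):
# salloc $(slurm_node_options "$CONS") mpiexec IMB-MPI1 PingPong
```

The same option can be passed to sbatch, so both jobs can share the helper.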

Note that the software environment is provided by GNU Guix, but programs can also be installed manually in the home directory of the "gitlab-gitlabci-gallery" account.