
Dynamic pool of virtual machines hosted on ci.inria.fr using Terraform

Why should we use a dynamic pool of virtual machines?

GitLab shared runners allow pipelines to be executed on virtual machines deployed on the fly. However, some jobs have requirements that shared runners do not cover: Windows or macOS, more CPUs, more memory, more disk space, large data sets, etc. The cloud of virtual machines provided by ci.inria.fr covers these needs by providing large resources for customizable virtual machines.

However, these virtual machines consume resources and electric power all day long whenever they are on, even if they are used for only a few minutes each time a commit is pushed. A good practice is to prepare virtual machine templates instead of keeping virtual machines running, and to instantiate the virtual machines only when they are needed. Doing so consumes less energy, frees resources for other users of the platform, and even allows the project to deploy reasonably larger resources, since they are held only while they are needed.

Prerequisites

This project has the same prerequisites as those listed for the terraform project.

The Terraform configuration file main.tf

The Terraform configuration file main.tf is similar to the configuration file described for the terraform project.

There is one additional variable: runner_count, of type number.

variable "runner_count" {
  type = number
}

The variable runner_count has two purposes:

  • It allows a virtual machine to be deployed conditionally. Indeed, one can pass -var runner_count=0 to terraform plan (and then apply the plan) in order to destroy the virtual machine(s).
  • It allows several virtual machines to be deployed if needed. For instance, this example deploys 3 copies of the template virtual machine, to run three jobs in parallel. Note that even if you don't need several copies of a virtual machine (either because you need only one virtual machine, or because you need virtual machines with different templates or characteristics), a variable such as runner_count is still useful: pass either 1 or 0, depending on whether the virtual machines should be deployed or destroyed.
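Since runner_count drives both creation and destruction, it can help to reject values that make no sense as a machine count. The following is a minimal sketch, not part of the original main.tf: the description text and the validation block are illustrative additions, and it assumes a Terraform version that supports custom variable validation (0.13 or later).

```hcl
# Sketch only: the description and the validation block are illustrative
# additions, not part of the original main.tf.
variable "runner_count" {
  type        = number
  description = "Number of runner virtual machines to deploy (0 destroys them all)."

  validation {
    # Accept only whole numbers greater than or equal to zero.
    condition     = var.runner_count >= 0 && floor(var.runner_count) == var.runner_count
    error_message = "The runner_count value must be a non-negative integer."
  }
}
```

With such a declaration, a plan run with -var runner_count=-1 or -var runner_count=1.5 would be expected to fail at variable validation rather than later, when count is evaluated.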

The virtual machines themselves are specified below.

resource "cloudstack_instance" "runner" {
  count            = var.runner_count
  name             = "gitlabcigallery-terraform-runner-${count.index}"
  service_offering = "Custom"
  template         = "ubuntu-20.04-lts"
  zone             = "zone-ci"
  details = {
    cpuNumber = 1
    memory    = 1024
  }
  expunge   = true
  user_data = templatefile("cloud-init.sh.tftpl", {
    index              = count.index
    REGISTRATION_TOKEN = var.REGISTRATION_TOKEN
    SSH_PUBLIC_KEY     = var.SSH_PUBLIC_KEY
  })
  connection {
    type                = "ssh"
    host                = self.name
    user                = "ci"
    private_key         = file("id_rsa")
    bastion_host        = "ci-ssh.inria.fr"
    bastion_user        = "gter001"
    bastion_private_key = file("id_rsa")
  }
  provisioner "remote-exec" {
    when   = destroy
    inline = ["sudo gitlab-runner unregister --all-runners || true"]
  }
}

In comparison to the terraform project, the additional property count specifies the number of virtual machines to be deployed (set by the input variable runner_count). The index of each virtual machine is available through count.index (which ranges from 0 to count-1). We use it as a suffix of the name, so that each virtual machine is named uniquely, and we pass it to the template file, so that the script cloud-init.sh.tftpl can register each runner with a different tag runner-${index}.
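To see which names and tags a given runner_count produces, one could add outputs next to the resource. This is a hedged sketch: the output names runner_names and runner_tags are hypothetical and do not appear in the original configuration, and the tag list only mirrors the runner-${index} convention described above (the actual registration is done by cloud-init.sh.tftpl).

```hcl
# Sketch only: both outputs are hypothetical additions for inspection.

# Names of all deployed instances, collected with a splat expression
# over the counted resource.
output "runner_names" {
  description = "Names of the virtual machines created by cloudstack_instance.runner."
  value       = cloudstack_instance.runner[*].name
}

# Tags the runners are expected to register with, one per index
# in the range 0 .. runner_count - 1.
output "runner_tags" {
  description = "Expected GitLab runner tags, following the runner-<index> convention."
  value       = [for i in range(var.runner_count) : "runner-${i}"]
}
```

With runner_count set to 3, runner_names would list gitlabcigallery-terraform-runner-0 through gitlabcigallery-terraform-runner-2, and runner_tags would list runner-0, runner-1 and runner-2, which are the tags selected by the pipeline's execute job.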

The pipeline specification file .gitlab-ci.yml

In comparison to the terraform project (https://gitlab.inria.fr/gitlabci_gallery/orchestration/terraform#the-pipeline-specification-file-gitlab-ciyml), we remove the build stage: the plan and the deployment are performed in the same deploy stage, which is no longer manual. Indeed, contrary to the terraform project, the deployment of the infrastructure is now necessarily tied to the subsequent execution of the pipeline on this infrastructure, because the infrastructure exists only for the duration of the pipeline and is destroyed at the end. There is an additional cleanup stage that destroys the runners at the end of the pipeline. The cleanup job has the property when: always, so that it is executed even when previous jobs fail.

The stages are then as follows.

stages:
  - validate
  - deploy
  - execute
  - cleanup

Every job that uses the Terraform configuration file needs to copy the file referred to by SSH_PRIVATE_KEY into the file id_rsa. To copy the file in the validate job without overriding the whole script, we use the before_script key.

validate:
  tags:
    - linux
    - small
  extends: .terraform:validate
  before_script:
    - cp $SSH_PRIVATE_KEY id_rsa

The deploy stage begins by deploying 0 runners (i.e., it destroys any runners that still exist). Usually, no runners have been deployed beforehand, so this should normally be a no-op, but it cleans the environment in case the cleanup stage of a previous pipeline failed. Then, new runners are deployed, 3 in this example.

deploy:
  stage: deploy
  tags:
    - linux
    - small
  script:
    - cp $SSH_PRIVATE_KEY id_rsa
    - gitlab-terraform plan -var runner_count=0
    - gitlab-terraform apply
    - gitlab-terraform plan -var runner_count=3
    - gitlab-terraform apply

The execute stage uses a matrix to run jobs in parallel on these three runners: each job of the matrix specifies the tag runner-$index, which selects the runner with the matching index.

execute:
  stage: execute
  image: alpine
  parallel:
    matrix:
      - index: [0, 1, 2]
  tags:
    - terraform
    - docker
    - runner-$index
  script:
    - echo Greetings from runner $index!

There is an additional cleanup job that is always executed (even if previous jobs failed) and destroys all the runners by assigning runner_count=0.

cleanup:
  stage: cleanup
  tags:
    - linux
    - small
  script:
    - cd "${TF_ROOT}"
    - cp $SSH_PRIVATE_KEY id_rsa
    - gitlab-terraform plan -var runner_count=0
    - gitlab-terraform apply
  when: always