Using Terraform to specify a pool of virtual machines hosted on ci.inria.fr

Why use Terraform to specify the pool of virtual machines used by continuous integration?

Terraform is an open-source tool developed by HashiCorp to declaratively specify the state of a cluster of virtual machines.

Virtual machines are specified by a collection of text-based .tf files, which can be stored and versioned in the project repository itself.

The arguments in favor of such an approach are similar to the arguments in favor of having a versioned specification for CI/CD pipelines (.gitlab-ci.yml, Jenkinsfile, etc.):

  • having both the cluster configuration and the pipeline specification versioned in the project repository allows developers to keep track of the history of the continuous integration environment: if something breaks, it is easier to see what has changed, in order to fix it or to roll back to a working version.

  • the specification is a reliable part of the documentation of the continuous integration environment, and the history makes it easier to follow its evolution, in particular when the maintenance burden is shared among multiple developers.

  • the specification makes the environment easily reproducible for other projects or other platforms, and common parts can be shared and factored out. In particular, if the environment is lost (if virtual machines are accidentally destroyed), it can be automatically restored.

Moreover, the Terraform language gives us a convenient API to interact with CloudStack (the platform behind ci.inria.fr), for instance to create and destroy virtual machines on demand. This particular point is illustrated in the project terraform-dynamic.

The Terraform configuration language is documented on the HashiCorp website.

Prerequisites

In addition to CI/CD features and shared runners (see the Prerequisites section in the intro project), projects using Terraform need:

  • A project on ci.inria.fr: see the tutorial to create a new project. (Note: Terraform never touches virtual machines that it did not create itself, and these virtual machines do not prevent it from creating its own, unless there are naming conflicts. Therefore, you don't need to start from an empty project.)

  • A dedicated user on ci.inria.fr with the Slave Admin role on this project. We recommend that this user be specific to this project (this requires using a specific email address, or a specific suffix for mail accounts that support it).

  • The CloudStack API and secret keys should be added as variables CLOUDSTACK_API_KEY and CLOUDSTACK_SECRET_KEY of type Variable in CI/CD settings. See the intro project for details on how to set a CI/CD variable. To get the CloudStack API and secret keys:

    • Go to https://sesi-cloud-ctl1.inria.fr/
    • Log in as the dedicated user, with ci/project-name as domain.
    • On the left sidebar, go to Accounts.
    • Select admins, then View Users.
    • Select the dedicated user.
    • Copy the fields API Key and Secret key.
  • The GitLab runner registration token should be added as a variable TF_VAR_REGISTRATION_TOKEN of type Variable in CI/CD settings. This token allows the virtual machines deployed by Terraform to register themselves as GitLab runners and execute jobs in the project pipeline. The TF_VAR_ prefix makes Terraform bind the value to the var.REGISTRATION_TOKEN variable inside the Terraform configuration files (Terraform documentation). To get the registration token:

    • On the left sidebar in the GitLab interface, go to Settings > CI/CD.
    • Expand Runners.
    • In the Specific runners section, you will find it after the label And this registration token:. The Copy token button just after the token copies it to the clipboard. (Note: in this example, the same project hosts both the Terraform configuration files and the build steps that are executed on the virtual machines maintained by Terraform, but this is not necessarily the case. A distinct project can be dedicated to maintaining the infrastructure with Terraform, with a variable REGISTRATION_TOKEN that points to another project dedicated to the build itself.)
  • This project needs a pair of passphrase-less SSH private/public keys for the GitLab shared runner to be able to connect to the deployed runners to unregister them from GitLab before deletion. You can use the following command to create a pair of SSH private/public keys without passphrase in the current directory (files id_rsa and id_rsa.pub): ssh-keygen -b 4096 -f id_rsa -N "".

    • The contents of the private key file id_rsa should be added as a variable SSH_PRIVATE_KEY of type File in CI/CD settings. See the intro project for details on how to set a CI/CD variable.
    • The public key file id_rsa.pub should be registered on the ci.inria.fr portal to allow the dedicated user to connect to the hosted virtual machines (see the portal documentation for details on how to register a public key on the portal). The contents of the public key file should also be added as a variable TF_VAR_SSH_PUBLIC_KEY of type Variable in CI/CD settings: the Terraform configuration file main.tf substitutes the public key in the cloud-init template cloud-init.yaml.tftpl, to register the key in the file ~ci/.ssh/authorized_keys in deployed virtual machines.
  • The repository contains a backend.tf file for connecting Terraform with GitLab. It may be convenient to delete it locally in order to run terraform directly on the development machine and experiment before committing changes to GitLab: in this setting, we prefer to use the default (local) backend. However, we do not want git to keep track of this deletion: after removing backend.tf, we can run git update-index --assume-unchanged backend.tf locally so that git ignores this change.

  • The pipeline defined in .gitlab-ci.yml contains a non-blocking step fmt to check that *.tf files are formatted according to the Terraform guidelines. Automatic reformatting can be performed locally with the command terraform fmt. The file .pre-commit-config.yaml defines a pre-commit hook to validate configuration files and perform automatic reformatting before each commit: you may enable it by installing pre-commit and initializing your repository with the command pre-commit install.
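
As an illustration of the backend.tf file mentioned above, a GitLab-managed Terraform state is usually reached through Terraform's generic http backend. The following is a minimal sketch under that assumption, not necessarily the exact file shipped in this repository: the block is left empty because the address and credentials are expected to be supplied from outside the file (for instance by the CI wrapper, through environment variables).

```hcl
# backend.tf (sketch): assumes the state is stored in GitLab through the
# generic "http" backend. No address or credentials are hard-coded here;
# they are assumed to be provided by the environment when the pipeline runs.
terraform {
  backend "http" {
  }
}
```

When this file is absent, Terraform falls back to the local backend and keeps the state in a terraform.tfstate file in the working directory, which is the behaviour wanted for local experiments.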

The Terraform configuration file main.tf

The configuration file main.tf contains some sections to set up CloudStack as a resource provider, and then the specification of the resources themselves.
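
The provider set-up is not reproduced in this page; it could look like the sketch below. The provider source address and the API URL are assumptions (the URL is derived from the CloudStack console host given in the Prerequisites section) and should be checked against the actual main.tf. The API key and secret key are intentionally absent: the CloudStack provider can read them from the CLOUDSTACK_API_KEY and CLOUDSTACK_SECRET_KEY environment variables, which are the CI/CD variables defined earlier.

```hcl
# Sketch only: the provider source and the API URL are assumptions,
# not values copied from this project's main.tf.
terraform {
  required_providers {
    cloudstack = {
      source = "cloudstack/cloudstack"
    }
  }
}

provider "cloudstack" {
  # Assumed endpoint, built from the CloudStack console host
  # mentioned in the Prerequisites section.
  api_url = "https://sesi-cloud-ctl1.inria.fr/client/api"

  # api_key and secret_key are deliberately omitted: they are taken from
  # the environment variables CLOUDSTACK_API_KEY and CLOUDSTACK_SECRET_KEY,
  # so that no secret is written in the repository.
}
```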

As explained in the previous section, the secret REGISTRATION_TOKEN is passed through a variable.

variable "REGISTRATION_TOKEN" {
  type      = string
  sensitive = true
}

The variable is marked as sensitive to prevent Terraform from showing its value in output (Terraform documentation).

The SSH public key is passed through the variable SSH_PUBLIC_KEY.

variable "SSH_PUBLIC_KEY" {
  type = string
}

The value of the SSH_PUBLIC_KEY variable will be stored in the file ~ci/.ssh/authorized_keys in virtual machines, so that Terraform can connect to the virtual machines with the private key to unregister the runners before destroying the machines.

In this example, we set up three resources: a virtual machine running on Ubuntu 20.04, a virtual machine running on Windows 10, and a virtual machine running on Mac OS X 15. The three virtual machines register themselves as runners on gitlab.inria.fr: the Ubuntu machine provides a docker executor, Windows and Mac OS X provide shell executors (powershell for Windows, bash for Mac OS X).

Ubuntu 20.04 virtual machine

resource "cloudstack_instance" "ubuntu" {
  ## It is a good practice to have the "{project name}-" prefix
  ## in VM names.
  name             = "gitlabcigallery-terraform-ubuntu"
  service_offering = "Custom"
  template         = "ubuntu-20.04-lts"
  zone             = "zone-ci"
  details = {
    cpuNumber = 2
    memory    = 2048
  }
  expunge = true
  user_data = templatefile("cloud-init.yaml.tftpl", {
    REGISTRATION_TOKEN = var.REGISTRATION_TOKEN
    SSH_PUBLIC_KEY     = var.SSH_PUBLIC_KEY
  })
  connection {
    type                = "ssh"
    host                = self.name
    user                = "ci"
    private_key         = file("id_rsa")
    bastion_host        = "ci-ssh.inria.fr"
    bastion_user        = "gter001"
    bastion_private_key = file("id_rsa")
  }
  provisioner "remote-exec" {
    when   = destroy
    inline = ["sudo gitlab-runner unregister --all-runners || true"]
  }
}
  • ubuntu is an identifier for the resource, which can be used to refer to it elsewhere in the Terraform configuration;
  • gitlabcigallery-terraform-ubuntu is the name of the virtual machine: by convention, the prefix gitlabcigallery-terraform is the name of the project on ci.inria.fr.
  • The service offering Custom allows us to specify the characteristics of the virtual machine in the details section:
    • cpuNumber should be between 1 and 16 (cores),
    • memory should be between 1024 and 24576 (MB).
  • template can refer to a template by name or ID. The available templates are listed on the ci.inria.fr portal in the virtual machine creation form (portal documentation). We rely here on the fact that cloud-init is installed in the template and takes into account the CloudStack user-data (CloudStack documentation for cloud-init support). We could also use a remote-exec provisioner to connect to the virtual machine via SSH and execute an initialization script on first boot: we use this method for the Windows and Mac OS X virtual machines, since cloud-init only exists on Linux.
  • There is only one zone, zone-ci. expunge should be set to true to ask CloudStack to destroy the virtual machine immediately when Terraform needs to replace it (by default, virtual machines are kept for 24 hours after deletion, which prevents Terraform from recreating a machine with the same name).
  • user_data contains a script which is passed to cloud-init to be run at the first boot of the virtual machine. The templatefile function reads the cloud-init configuration from the file cloud-init.yaml.tftpl and substitutes ${REGISTRATION_TOKEN} with the value of the variable passed to Terraform.
  • We also pass SSH_PUBLIC_KEY to the template file so that its value is written to the ~ci/.ssh/authorized_keys file. We configure the SSH connection to the runner: gter001 is the login of the dedicated user on ci.inria.fr, and we will make sure in the .gitlab-ci.yml section below that the private key is written to the file id_rsa. We cannot use a variable to pass the path to this file, since the connection is used by a destroy provisioner, which cannot refer to variables. This destroy provisioner executes gitlab-runner unregister before the virtual machine is destroyed; failures are ignored in case the gitlab-runner command was not yet installed when the destruction occurs.
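
The points above about the Custom offering limits and about referring to a resource by its identifier can be illustrated with the following sketch. The variables CPU_NUMBER and MEMORY and the output ubuntu_vm_name are hypothetical names introduced for this example only; they are not part of the project. The output relies on the cloudstack_instance.ubuntu resource defined above.

```hcl
# Hypothetical variables (not in the project): they make the limits of the
# "Custom" service offering explicit, so that an out-of-range value is
# rejected before any request is sent to CloudStack.
variable "CPU_NUMBER" {
  type    = number
  default = 2

  validation {
    condition     = var.CPU_NUMBER >= 1 && var.CPU_NUMBER <= 16
    error_message = "The number of CPU cores must be between 1 and 16."
  }
}

variable "MEMORY" {
  type    = number
  default = 2048

  validation {
    condition     = var.MEMORY >= 1024 && var.MEMORY <= 24576
    error_message = "The memory size must be between 1024 and 24576."
  }
}

# Referring to the resource elsewhere in the configuration:
# <resource type>.<identifier>.<attribute>
output "ubuntu_vm_name" {
  value = cloudstack_instance.ubuntu.name
}
```

With such variables, the details section of a resource could use cpuNumber = var.CPU_NUMBER and memory = var.MEMORY instead of literal values.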

Windows 10 virtual machine

resource "cloudstack_instance" "windows" {
  ## It is a good practice to have the "{project name}-" prefix
  ## in VM names.
  name             = "gitlabcigallery-terraform-windows"
  service_offering = "Custom"
  template         = "windows10-vs2022-runner"
  zone             = "zone-ci"
  details = {
    cpuNumber = 2
    memory    = 2048
  }
  expunge = true
  connection {
    type                = "ssh"
    host                = self.name
    user                = "ci"
    password            = "ci"
    bastion_host        = "ci-ssh.inria.fr"
    bastion_user        = "gter001"
    bastion_private_key = file("id_rsa")
    target_platform     = "windows"
  }
  provisioner "remote-exec" {
    inline = [<<-EOF
      gitlab-runner start
      gitlab-runner register --non-interactive --tag-list terraform,windows --executor shell --shell powershell --url https://gitlab.inria.fr --registration-token ${var.REGISTRATION_TOKEN}
      EOF
    ]
  }
  provisioner "remote-exec" {
    when   = destroy
    inline = ["gitlab-runner unregister --all-runners || true"]
  }
}
  • It is essential to specify target_platform = "windows" for the SSH connection to work.

  • The gitlab-runner service is not started on boot by default in the template: we start the service explicitly before registering the runner.

Mac OS X 15 virtual machine

resource "cloudstack_instance" "macos" {
  ## It is a good practice to have the "{project name}-" prefix
  ## in VM names.
  name             = "gitlabcigallery-terraform-macos"
  service_offering = "Custom"
  template         = "osx-15-runner"
  zone             = "zone-ci"
  details = {
    cpuNumber = 2
    memory    = 2048
  }
  expunge = true
  connection {
    type                = "ssh"
    host                = self.name
    user                = "ci"
    password            = "ci"
    bastion_host        = "ci-ssh.inria.fr"
    bastion_user        = "gter001"
    bastion_private_key = file("id_rsa")
  }
  provisioner "remote-exec" {
    inline = [<<-EOF
      set -ex
      (
        export PATH=/usr/local/bin:$PATH
        gitlab-runner register --non-interactive --tag-list terraform,macos --executor shell --url https://gitlab.inria.fr --registration-token ${var.REGISTRATION_TOKEN}
      ) >~/log.txt 2>&1
      EOF
    ]
  }
  provisioner "remote-exec" {
    when = destroy
    inline = [<<-EOF
      export PATH=/usr/local/bin:$PATH
      gitlab-runner unregister --all-runners || true
      EOF
    ]
  }
}
  • In the remote-exec provisioner, outputs are redirected to ~/log.txt to ease debugging, since they are not shown in the GitLab log.

  • The executable gitlab-runner is in /usr/local/bin, which is added to PATH by .bashrc (or .zshrc). These files are not sourced when commands are executed non-interactively via SSH. Therefore, we add /usr/local/bin explicitly to PATH before executing gitlab-runner (we could have sourced ~/.bashrc instead).

The cloud-init configuration file cloud-init.yaml.tftpl

The cloud-init configuration file cloud-init.yaml.tftpl sets up the following:

  • the user ci can execute sudo without a password, so that the destroy provisioner is able to unregister the runners;
  • the SSH public key is registered as an authorized key for ci;
  • by default, password authentication is disabled for ci (you may add lock_passwd: false to enable it again, documentation);
  • gitlab-runner and docker.io are installed on the virtual machine, and the runner is registered on gitlab.inria.fr.

The configuration file should begin with the following line.

#cloud-config

You may provide a shell script with the corresponding shebang (#!/bin/sh) instead.

The pipeline specification file .gitlab-ci.yml

The pipeline specification file .gitlab-ci.yml relies on the template Terraform/Base.gitlab-ci.yml, provided by gitlab.com. The usage of this template is described in the GitLab documentation.

This template uses the Docker image $CI_TEMPLATE_REGISTRY_HOST/gitlab-org/terraform-images/releases/1.1:v0.43.0, where CI_TEMPLATE_REGISTRY_HOST is set by default to registry.gitlab.com. To use it on shared runners, we have mirrored this image locally to circumvent quotas: it is available on registry.gitlab.inria.fr (which we use for CI_TEMPLATE_REGISTRY_HOST), in the project gitlab-org/terraform-images.

There are five stages:

stages:
  - validate
  - build
  - deploy
  - execute
  - destroy
  • In the stage validate, the step validate checks that there is no error in *.tf files and the step fmt checks that they are properly formatted according to the guidelines (this fmt step is non-blocking: the step can fail without stopping the pipeline).

  • In the stage build, the homonymous step plans the modifications to apply to CloudStack to conform to the configuration files. The plan is stored as an artifact.

  • In the stage deploy, the homonymous step applies the plan. This step is configured to be triggered manually: it can be triggered by clicking on Play in the project Pipelines page (on the left sidebar in the GitLab interface, go to CI/CD > Pipelines).

  • In the stage execute, the homonymous step executes a command using the docker-based GitLab runner hosted in the deployed virtual machine. (Note: we chose in this example to perform the execute stage in the same project, but we could have chosen to register the runner to another project by adjusting the REGISTRATION_TOKEN variable.)

  • In the stage destroy, the homonymous step is to be run manually and executes gitlab-terraform destroy, which destroys the virtual machines (and these virtual machines have remote-exec destroy provisioners that unregister themselves from gitlab.inria.fr).

Every job that uses the Terraform configuration file needs to copy the file referred to by SSH_PRIVATE_KEY into the file id_rsa. To copy the file without overriding the whole script, we use the before_script key: defining before_script at top level, outside any job, causes the file to be copied before every job.

before_script:
  - cp $SSH_PRIVATE_KEY id_rsa

Ignored files in .gitignore

.gitignore instructs git to ignore the local files generated by the terraform command, in case this command is used locally for experimentation.