Introductory documentation to GitLab @ INRIA
- FAQ - Frequently Asked Questions
- Good usage of the gitlab?
- Is there a tutorial about git and gitlab at Inria ?
- Why can't I connect to GitLab as an Inria member or external collaborator?
- Should I choose "Clone with SSH" or "Clone with HTTPS"?
- How to not have to type a password for every git operation
- How to create/manage external (non-Inria) user accounts
- What needs to be done when a user departs and has projects in their personal space?
- How to convert a user account from internal to external
- How to convert a user account from external to internal
- How to activate a commit email hook
- How to rewrite history
- How to add users of an INRIA ildap group (eg. team members) to a gitlab group
- How to use the continuous integration (CI) service?
- What are git-lfs's recommended uses?
- Quota management policy to monitor disk space by project on Gitlab
- How can I manage and reduce my disk space?
- When does the project disk space monitoring script run?
- What is taken into account when calculating the disk space occupied by my project?
- How much storage space is my project taking up?
- How much space is occupied by all my projects?
- How much space is taken by the artifacts and logs at each pipeline execution?
- How do I remove a large file or a binary added by mistake?
- How can I reduce the size of a repository?
- How can I reduce the various spaces occupied by a GitLab project?
- Why doesn't disk space decrease despite deletions?
- How do I clean the pipelines (logs, artifacts) ?
- How do I delete the oldest or most recent artifacts?
- How do I automate storage management for CI/CD pipelines/job logs/job artifacts?
- How do I automate storage management for container registries?
- How do I automate storage management for package registries?
- How do I manage storage used by package assets?
- Merge Request development model and external contributors
- Gitlab-Pages
- Broken Links with Uploaded Files in a Project Wiki
FAQ - Frequently Asked Questions
Good usage of the gitlab?
Gitlab is primarily aimed at providing a set of tools for collaborative work on software development projects. It can also be used for collaborative work on projects not directly related to software development, such as the collaborative editing of scientific publications, since editing source code and editing text are similar activities.
Keep in mind that the design and sizing of gitlab have some limitations:
- Git is designed to keep a complete history of operations. This history is not meant to be changed, and changing it is rather convoluted (see related faq entry). This means that if you put a file under git and later remove it, the file is removed only from the current and following revisions, not from the old ones, so disk space is not reclaimed. If you commit different versions of a single file, each version is stored and uses disk space.
- Git is best used to store text files (source code, latex source, etc.), because they compress well.
- Git was initially designed to version source code. The size of a git repository such as the linux kernel repository is in the order of 100 to 200 MB (and this is a rather large project). Repositories whose size is one or more orders of magnitude greater will be slow.
- The worst case is storing big binary files (they don't compress well)
- Additionally, having huge repositories will impact not only git, but also gitlab as a whole:
- The web interface will be slow (eg. looking at big commits in the web interface will time out)
- Continuous Integration will be impacted (eg. big runners are needed, build artifacts will be huge, and jobs may fail or time out)
Hence, some gitlab usages are rather inappropriate:
- Using source control to store a lot of binary data (either lots of medium to big files, or a few enormous files).
- In particular, it is bad practice to use git to publish and share big experimental datasets (inputs or outputs), virtual machine images, and more generally any kind of "big data". You should instead version the code that generates this data or these VM images.
Alternative solutions:
- If you need to publish huge volumes of data, you need to find some network storage outside of the gitlab. Ask your IT contacts for solutions, or find funding for your own servers / space. The gitlab is not a Dropbox alternative for big data!
- git-lfs may be a good solution.
- An alternative to git-lfs is git-annex.
The boundary between good and bad usage of the gitlab is fuzzy, but as there are a lot of users and projects on the gitlab, if too many people abuse the system we will have to put hard limits and restrictions on its usage, which will be annoying for everyone.
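The point about history made above can be checked locally. The following is a minimal sketch, assuming git is installed; it works in a throw-away directory, and the file name dataset.bin is purely illustrative.

```shell
# Throw-away repository; the directory and file names are illustrative.
demo=$(mktemp -d)
git init -q "$demo"
git -C "$demo" config user.name "Demo User"
git -C "$demo" config user.email "demo@example.org"

# Commit a file, then delete it in a second commit.
printf 'pretend this is a large binary\n' > "$demo/dataset.bin"
git -C "$demo" add dataset.bin
git -C "$demo" commit -q -m "Add dataset"
git -C "$demo" rm -q dataset.bin
git -C "$demo" commit -q -m "Remove dataset"

# The file is gone from the working tree...
ls "$demo"

# ...but its content is still stored and reachable from the first commit.
git -C "$demo" rev-list --objects --all | grep dataset.bin
```

The last command still lists dataset.bin, which is why deleting a file in a new commit does not free any space on the server.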
Is there a tutorial about git and gitlab at Inria ?
Yes, follow this link.
Why can't I connect to GitLab as an Inria member or external collaborator?
https://gitlab.inria.fr/siteadmin/doc/-/wikis/home#gitlab-accounts
Should I choose "Clone with SSH" or "Clone with HTTPS"?
We advise you to choose "Clone with SSH".
Indeed, the amount of data that can be transferred in an HTTP request or response is capped, which can cause issues when cloning large repositories over HTTPS.
If you encounter problems with git push on a repository cloned over HTTPS, here are the commands to change the remote configuration to SSH:
$ git remote -v
origin https://gitlab.inria.fr/OWNER/REPOSITORY.git (fetch)
origin https://gitlab.inria.fr/OWNER/REPOSITORY.git (push)
$ git remote set-url origin git@gitlab.inria.fr:OWNER/REPOSITORY.git
# Verify new remote URL
$ git remote -v
origin git@gitlab.inria.fr:OWNER/REPOSITORY.git (fetch)
origin git@gitlab.inria.fr:OWNER/REPOSITORY.git (push)
See also the following section
On the other hand, "Clone with HTTPS" can be useful in read-only mode.
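If several repositories need to be switched, the URL rewriting shown above can be scripted. This is only a sketch: the function names are made up for the example, and it assumes the usual GitLab mapping from https://HOST/PATH to git@HOST:PATH.

```shell
# Convert an HTTPS clone URL into the equivalent SSH form.
# Assumes the usual GitLab layout: https://HOST/PATH -> git@HOST:PATH
https_to_ssh() {
    printf '%s\n' "$1" | sed -E 's#^https://([^/]+)/(.+)$#git@\1:\2#'
}

# Switch the "origin" remote of the current repository if it uses HTTPS.
switch_origin_to_ssh() {
    current=$(git remote get-url origin) || return 1
    case "$current" in
        https://*) git remote set-url origin "$(https_to_ssh "$current")" ;;
        *) echo "origin already uses: $current" ;;
    esac
}
```

Run switch_origin_to_ssh from inside a clone, then check the result with git remote -v.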
How to not have to type a password for every git operation
First of all, to avoid having to type your login/password each time, the best option is to use git over ssh rather than git over https (see below for git over http).
On your local computer:
- git relies on ssh for the connection to the gitlab's git server
- ssh uses your ssh private key to authenticate to the gitlab's git server
- your ssh private key is stored in your home directory, on your local computer's hard disk. This private key needs to be encrypted, so that if your local computer is breached or stolen, the key cannot be used. This is why it is encrypted with a so-called "passphrase". This passphrase is actually a password, but you are encouraged to use something longer and more complex than a usual password (because if this passphrase is "cracked", it can lead to a breach on many servers, so it is critical from a security point of view)
Up to now, everything works correctly, but you have to type your ssh passphrase each time you connect by ssh, thus at each git operation involving the remote git server (push, pulls, ...)
The ssh-agent's role is to avoid that: it is a program running in the background on your local computer. Each time your local ssh client needs to connect to a server, if it detects a running ssh-agent, it first asks the agent to perform the authentication with the decrypted ssh private key. The first time, the passphrase is requested to decrypt the key, which the ssh-agent then keeps in memory as long as you don't log out of your local computer.
It is the ssh-agent that spares you from entering the passphrase at each ssh connection.
By default, the ssh client is installed with a command line ssh-agent, but modern operating systems / graphical environments come with integrated "keyrings" which can act as ssh-agents (and can also manage other kinds of secrets, such as passwords, gpg keys, etc.):
- for OS X: Keychain
- for linux / gnome: gnome-keyring
- for linux / kde: kde-wallet
- for other linux desktop environments, or for windows: ?
These "keyrings" work as follows: When you login on your local computer, your login password is automatically used to unlock the secrets stored in the keyring, of which the ssh keys are part.
Thus to be able to perform connected git operations (push, pull, ...) without having to type any password you need:
- to use git over ssh
- to have a ssh keypair protected by a passphrase
- to use a command line or graphical ssh-agent
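For the command line ssh-agent, the three requirements above can be sketched as follows. The function names and the default key path (~/.ssh/id_ed25519) are assumptions made for the example; the exit statuses of ssh-add -l (0, 1, 2) are those documented in its manual page.

```shell
# Report the state of the ssh-agent, based on the exit status of "ssh-add -l":
# 0 = at least one key loaded, 1 = agent reachable but empty,
# anything else = no agent could be contacted.
agent_status() {
    ssh-add -l >/dev/null 2>&1
    case $? in
        0) echo "keys-loaded" ;;
        1) echo "no-keys" ;;
        *) echo "no-agent" ;;
    esac
}

# Start an agent if needed, then load the key (asks for the passphrase once).
# The default key path is an assumption; adjust it to your own key file.
load_key() {
    key=${1:-$HOME/.ssh/id_ed25519}
    if [ "$(agent_status)" = "no-agent" ]; then
        eval "$(ssh-agent -s)" >/dev/null
    fi
    [ "$(agent_status)" = "keys-loaded" ] || ssh-add "$key"
}
```

The functions must be sourced into the current shell (not run as a separate script) so that the environment variables set by ssh-agent remain visible to later git commands. With a graphical keyring, none of this is needed.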
Note: it is also possible to do passwordless operations with git over http by using a personal access token:
- create a personal access token in your gitlab user settings, with the "read_repository" permission (only this permission).
- when cloning the repository, customize the URL as follows: https://<username>:<token>@gitlab.inria.fr/<end_of_url>
How to create/manage external (non-Inria) user accounts
Gitlab is a service which may be made available for external users. Their account is managed via the dedicated Inria external account portal.
The procedure and detailed information are provided here.
What needs to be done when a user departs and has projects in their personal space?
Once a user's account is deactivated, their personal space becomes inaccessible. However, the space and its contents are not deleted. The user should move the projects in personal space to appropriate group spaces.
How to convert a user account from internal to external
When a user leaves Inria, the gitlab account cannot be used anymore. To keep access to the projects, please follow this procedure between the user's departure and 2 months later. The request must not be made before the contract end date, as the administrators can only carry out the conversion once the person's computer account has been closed (i.e. once the departure date has passed).
Anyone with an active Inria account can open a ticket on helpdesk.inria.fr in the category "Demande sur GitLab", requesting to convert the account and providing :
- the name/firstname/login of the person leaving Inria,
- the date of the departure from Inria,
- the external email that will be used as identifier and for password recovery,
- the name of the sponsor at Inria and
- until when the account is needed (maximum is 3 years, renewable by the sponsor).
Before doing so, the user must ensure that they have no image in the container registry of any of their private projects. Otherwise, the conversion will fail because the user's login cannot be modified. If the user wants to keep the registry content, they can docker pull the images to a local destination, then docker push them to the new repository (with the new external login).
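The pull-then-push step could look like the sketch below. Several things here are assumptions and not taken from this page: the registry host (registry.gitlab.inria.fr), the example logins jdoe and x-jdoe, the image path, and the function names. It also assumes docker login has already been done against the registry. Removing the images from the registry between the two steps (for instance from the project's container registry page) is not shown.

```shell
# Registry host is an assumption; check the container registry page of your project.
REGISTRY="registry.gitlab.inria.fr"

# Compute the image path under the new login, e.g.
#   retarget_image jdoe x-jdoe jdoe/myproject/app:1.0  ->  x-jdoe/myproject/app:1.0
retarget_image() {
    old_login=$1
    new_login=$2
    image=$3
    printf '%s\n' "${new_login}/${image#"${old_login}"/}"
}

# Step 1, before the conversion: keep a local copy of the image.
save_image() {      # usage: save_image jdoe/myproject/app:1.0
    docker pull "$REGISTRY/$1"
}

# Step 2, after the conversion: tag the local copy with the new path and push it.
restore_image() {   # usage: restore_image jdoe x-jdoe jdoe/myproject/app:1.0
    new_image=$(retarget_image "$1" "$2" "$3")
    docker tag "$REGISTRY/$3" "$REGISTRY/$new_image"
    docker push "$REGISTRY/$new_image"
}
```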
The conversion will be made after the user has left Inria, when the Inria account is deactivated :
- an administrator makes the conversion to the new account and gives the new identifier to the user;
- then, the user can ask for password recovery using this new identifier and provide a new password.
For Mattermost, the transfer cannot be made automatically:
- the user has to be removed from their subscriptions;
- the Mattermost service has to be added to the new external account and
- the new account has to be added manually to each Mattermost team and channel of interest.
How to convert a user account from external to internal
External users can one day be hired by Inria and ask for an internal gitlab account. In this situation the external account can be converted into an internal one with proper inria email and iLDAP parameters. To do so the user must open a ticket (https://helpdesk.inria.fr/, section gitlab) requesting to convert the existing external account into an internal one, providing:
- the current external account login (e.g. x-Jdoe),
- the inria email
- and the iLDAP login.
How to activate a commit email hook
Gitlab's model is that, instead of setting up a centralized commit email hook per project, it is up to each user to choose their notification level per project. All users can set their 'global' notification level in their account settings. They can then override this default per project at the top of the project details page, with the little bell (this way, a project cannot impose a notification level on its members).
If you still want something similar to the old commit email hooks, then in your project configuration, section "Integrations/Project services", activate the "Emails on push" service. Unfortunately, you then need to enter the explicit list of recipients. We haven't found a way to configure this so that the list of project members is automatically used as the recipients list.
How to rewrite history
As documented in https://gitlab.inria.fr/help/user/permissions, project members with the Developer, Maintainer or Owner role are allowed to force push to non-protected branches. If you want to rewrite the history of a protected branch, you first need to switch the branch to non-protected (only a Maintainer or Owner can do that), do the force push, then switch the branch back to protected.
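As an illustration of a force push on a non-protected branch, here is a self-contained sketch that uses a local bare repository as a stand-in for the GitLab server. The branch name feature, the file name and the repo_git helper are made up for the example. It uses --force-with-lease, a safer variant of --force that refuses to overwrite commits you have not fetched.

```shell
# Stand-in for the GitLab server: a local bare repository (illustrative only).
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/clone"
repo_git() { git -C "$work/clone" "$@"; }
repo_git symbolic-ref HEAD refs/heads/feature
repo_git config user.name "Demo User"
repo_git config user.email "demo@example.org"
repo_git remote add origin "$work/remote.git"

# Publish a first commit on a non-protected branch.
echo "draft" > "$work/clone/notes.txt"
repo_git add notes.txt
repo_git commit -q -m "Add notse"     # typo in the message, fixed below
repo_git push -q origin feature

# Rewrite the last commit locally, then replace the published history.
repo_git commit -q --amend -m "Add notes"
repo_git push -q --force-with-lease origin feature
```

On a protected branch, the last command is rejected by the server until the branch has been unprotected as described above.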
How to add users of an INRIA ildap group (eg. team members) to a gitlab group
Open a ticket, specifying the INRIA ildap group name, the gitlab group name, and the access level (GUEST, REPORTER, DEVELOPER, MAINTAINER, OWNER) that you wish to give to added members. We will run a script which adds existing gitlab users matching the ildap group members to the gitlab group. Be aware that inria members need to connect at least one time to gitlab for their account to be created.
Provided you are in the INRIA network and have access to ldap://ildap.inria.fr, you can search for a groupname with a command such as:
$ ldapsearch -x -H "ldap://ildap.inria.fr" -b "ou=groups,dc=inria,dc=fr" | grep -i <team name>
For verification purposes you can check the list of ildap group members with a command such as:
$ ldapsearch -x -H "ldap://ildap.inria.fr" -b "ou=People,dc=inria,dc=fr" "(inriaGroupMemberOf=cn=<ldap group name>,ou=groups,dc=inria,dc=fr)"
How to use the continuous integration (CI) service?
Besides using Inria's Continuous Integration platform ci.inria.fr, you can use gitlab's integrated CI pipelines, see the official documentation and the complete gitlab-ci yaml reference.
Concerning the machines setup to run the tests you can either use existing "shared runners" (to be used with docker images), see section Use existing shared runners hereafter, or you will need to install gitlab-runner on your specific machine and then register the runner with your project, see section Installing your own specific runners.
In addition, if you look for "real life" gitlab-ci examples, please visit this dedicated group gitlabci_gallery. It contains several subgroups and git repositories showing some interesting key features of gitlab-ci and possible integrations with external tools/platforms (ci.inria.fr, terraform, github, a supercomputer, etc).
Enabling CI on a gitlab project
- Go to your project's settings
- In the General Permissions tab, search for "CI/CD", enable it and do not forget to click on Save changes
- A CI / CD tab should now appear in the settings
- In this CI / CD tab, you will find the url and the registration token you will need later, when adding a specific runner.
Use existing shared runners
If you don't want to manage runners yourself, you can use existing ones provided by ci.inria.fr, where jobs are executed in docker containers. The available shared runners are listed in Settings -> CI/CD -> Runners -> Shared runners (right column). Please read the available documentation for more information. Jobs must declare tags to use the shared runners, e.g.
myjob:
  tags: ['ci.inria.fr', 'small']
  image: docker:19.03.12
  script:
    - whoami
Using a Custom GitLab Runner with GPU Support
The shared GitLab runners do not support GPU workloads. If your project requires GPU (e.g., for training ML models), you must configure your own runner on a machine with GPU access, cf Installing your own specific runners.
We recommend that you use light installations, for example with this command tried and tested by Pascal Carrivain :
apt-get -y install --no-install-recommends cuda-compiler-12-8 cuda-runtime-12-8 cuda-profiler-api-12-8 libcurand-12-8 libcurand-dev-12-8
instead of complete installation, which includes the profiler and NVIDIA Nsight Graphics :
apt-get -y install cuda-toolkit-12-8
- Grid'5000 (Public Infrastructure): national testbed with GPU nodes for research. Request access here: https://www.grid5000.fr
- Local Compute Resources (SED): use GPU-equipped machines provided by your team or computing center. Contact an engineer at your SED to discuss access options and possible setup assistance.
Installing your own specific runners
gitlab-runner is a program (executable) to be installed on the machine which will execute the jobs.
Then, after installation, a runner registration allows the machine to contact the gitlab's instance (e.g. gitlab.inria.fr) and trigger jobs coming from one specific or several gitlab's projects, see runners scope and enable a project runner for a different project.
You can use a virtual machine on ci.inria.fr (Linux, Windows or MacOS) to host your GitLab runner.
See:
- https://inria-ci.gitlabpages.inria.fr/doc/page/web_portal_tutorial/ to create a slave on Inria's CI platform and access it (you can ignore the jenkins related parts)
- https://docs.gitlab.com/runner/install/ for the official documentation to install the runner on the vm(s) you've created.
Installation example on a GNU/linux slave from ci.inria.fr ("ci" user account)
sudo curl -L --output /usr/local/bin/gitlab-runner "https://gitlab-runner-downloads.s3.amazonaws.com/latest/binaries/gitlab-runner-linux-amd64"
sudo chmod +x /usr/local/bin/gitlab-runner
sudo gitlab-runner install --user=ci --working-directory=/builds
sudo gitlab-runner start
sudo gitlab-runner status # should return "service is running"
Installation example on a macOS slave from ci.inria.fr ("ci" user account)
Limitations on macOS : The service needs to be installed from a Terminal window logged in as your current user (i.e., "ci" user account).
sudo curl --output /usr/local/bin/gitlab-runner https://gitlab-runner-downloads.s3.amazonaws.com/latest/binaries/gitlab-runner-darwin-amd64
sudo chmod +x /usr/local/bin/gitlab-runner
# Run the following commands as the "ci" user
gitlab-runner install --working-directory /builds
gitlab-runner start
gitlab-runner status # should return "service is running"
Installation example on a Windows slave from ci.inria.fr ("ci" user account)
# Follow https://docs.gitlab.com/runner/install/windows.html then when installing the service prefer the following to use the existing "ci" user account
gitlab-runner install -u ".\ci" -p "ci" -d "C:\Users\ci"
gitlab-runner start
gitlab-runner status # should return "service is running"
Register a runner to trigger project's job
The gitlab-runner program contacts gitlab to fetch and run the jobs of a particular project.
To initiate the communication between the machine and the gitlab's project one has to register a new runner.
To do so, visit your gitlab's project, go to Settings -> CI/CD -> Runners and click on New project runners.
You can add tags to be able to identify the type of machine (e.g. 'linux', 'ci.inria.fr', 'debian', ...).
Then, click on Create runner.
Copy/paste the command line given in the Step 1 section into a shell terminal on the virtual machine where you installed the gitlab-runner program.
Run it as root or with sudo if the gitlab-runner program has been installed with sudo; otherwise remove sudo from the following command:
sudo gitlab-runner register --url https://gitlab.inria.fr --token glrt-t3_8JZybA2_M3xx4zndxiyA
Several questions must be answered:
- "Enter the GitLab instance URL" -> enter key (https://gitlab.inria.fr is the right one)
- "Enter a name for the runner" -> keep the default (the local hostname) or type a different one, enter key
- "Enter an executor" -> choose from the given list; it will usually be shell (uses the current account, system environment and shell to run jobs) or docker (runs jobs in a docker image given in the gitlab-ci job definitions, see .gitlab-ci.yml).
At the end of this step, your runner should appear in the Settings -> CI/CD -> Runners -> Assigned project runners tab of your project.
This specific runner can be removed (unregistered) as follows (in a shell terminal of the machine):
sudo gitlab-runner unregister --url https://gitlab.inria.fr --token glrt-t3_8JZybA2_M3xx4zndxiyA
and by clicking on Remove runner in the Assigned project runners tab of the gitlab's project.
Configuring CI tasks
Finally, configure the tasks to run by creating a .gitlab-ci.yml file at the root of your project.
Follow the official documentation at https://docs.gitlab.com/ce/ci/yaml/ to create this file.
Troubleshooting gitlab runners
A few logging-related issues were found in older gitlab-runner versions. When using a dedicated (non-shared) runner, you might want to ensure that your gitlab-runner version is up to date.
If you have any issue with gitlab runner (for example, an online runner but no debug logs) during your pipeline execution, the following official documentation may help you. For example, it may show you how to enable debug logging on your runner.
Using a docker executor
It has been reported that there are DNS issues with docker running on the INRIA CI's VMs (due to bad interaction between dnsmasq and docker, see https://stackoverflow.com/questions/49998099/dns-not-working-within-docker-containers-when-host-uses-dnsmasq-and-googles-dns). This can be solved by adding network_mode = "host" in the configuration, e.g. in /etc/gitlab-runner/config.toml:
concurrent = 1
check_interval = 0
[[runners]]
name = "ci.inria"
url = "https://gitlab.inria.fr/"
token = "..."
executor = "docker"
[runners.docker]
network_mode = "host"
tls_verify = false
image = "alpine:latest"
privileged = false
disable_cache = false
volumes = ["/cache"]
shm_size = 0
[runners.cache]
What are git-lfs's recommended uses?
What is Git LFS?
Git LFS (Large File Storage) is a Git extension that efficiently handles large files by storing them separately from the main Git repository, possibly on a separate server. It is designed to manage large binary files such as images, videos, simulation data, etc., without sacrificing Git's performance or version management.
In fact, Git is well suited to managing text files (based on differences) such as scripts or code, .tex articles or documentation, a laboratory notebook, an html website or system configuration files.
On the other hand, Git is less suited to managing large or binary files, hence the existence of the Git-LFS extension, which versions them through pointer files: Git Large File Storage (LFS) replaces large files such as audio samples, videos, data sets and graphics with text pointers inside Git, while storing the contents of the files on a remote server.
You can use the Git-LFS functionality by using the space provided by the Inria Gitlab services. In this case, the space used is deducted from your disk space quota. You can also use disk space not provided by the Inria Gitlab services by using your own servers (see below https://gitlab.inria.fr/siteadmin/doc/-/wikis/faq#deploying-your-own-git-lfs-server). In this second case, the space used by Git-LFS will not be deducted from your authorised quota on Inria Gitlab services.
Examples of use cases recommended on Inria's gitlab services
A Git-LFS storage service has been activated on the gitlab.inria.fr and gitlab-int.inria.fr servers. The recommended uses of the space thus made available (5TB in February 2025) are limited to data directly related to development. The volume consumed in this way counts towards the disk space quotas made available to projects.
Recommended uses
Data directly related to development are for example :
- Test datasets
- Datasets used in tutorials or examples
- Reference result data
- Images, 3D models, or datasets distributed with the software
- Traces to understand the performance of each version
- Reference articles, figures or user documentation in binary formats (pdf, presentations, etc.)
Uses not recommended
On the other hand, the following are not considered to be directly related to development:
- Data for training AI models
- Software compilation results
- Compiled versions of dependencies
- Generated data
- Calibration files for checking hardware devices
For these purposes of data storage and/or distribution to third parties, the data must be stored elsewhere, for example on a web space hosted on the https://files.inria.fr server up to 25GB (https://doc-si.inria.fr/display/SU/Espace+web).
Git LFS should not be used to back up documents either. To meet this need, a dedicated service has been deployed by the DSI: https://doc-si.inria.fr/display/SU/Sauvegardes#
Deploying your own Git-LFS server
Reference documentation by Erwan Demairy: a minimal example where git-lfs data is stored on its own server. https://gitlab.inria.fr/sdt-pfo/tools/gitlab-test-lfs
Reference documentation on Git-LFS
https://docs.gitlab.com/ee/topics/git/lfs/
https://docs.gitlab.com/ee/topics/git/lfs/#removing-objects-from-lfs
https://about.gitlab.com/blog/2017/01/30/getting-started-with-git-lfs-tutorial/
Quota management policy to monitor disk space by project on Gitlab
Cf. document Politique_Gestion_Quotas_GitLab.pdf (missing link on page 2 of the document, point 5: https://gitlab.inria.fr/dgdi-sdt/pfo/gitlab-test-lfs/-/tree/master?ref_type=heads)
Setting quotas per project is a way of encouraging users to regularly review their use of disk space and the relevance of the data they store. The aim is not so much to save money as to make sensible use of disk space, while allowing projects to handle large volumes of data if necessary.
The general recommendation is not to have Git repositories larger than 1 GB to preserve performance. At present, each project has 20GB of storage space by default. As soon as 80% of disk space is used, repository owners will be warned and invited to clean up their files and reduce their size as much as possible. When a project exceeds the threshold of 20 GB, it will remain accessible for reading but it will no longer be possible to make modifications within the project.
Beyond this default blocking threshold, a dialogue with the DSI and the PFO team will be initiated, via a ticket to the GitLab team on the helpdesk, to assess whether the use of git (including git-lfs) generating a large use of disk space on community servers is a justified approach.
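The thresholds described above can be summarised in a few lines of shell. This is only an illustration of the policy, not the monitoring script itself; the function name is made up, and 1 GB is assumed here to mean 10^9 bytes.

```shell
# Thresholds taken from the policy above; 1 GB is assumed to mean 10^9 bytes.
QUOTA_BYTES=$((20 * 1000 * 1000 * 1000))
WARN_PERCENT=80

# Classify a project's storage size (in bytes) against the default quota.
quota_status() {
    used=$1
    if [ "$used" -gt "$QUOTA_BYTES" ]; then
        echo "blocked"      # project stays readable, modifications are refused
    elif [ $((used * 100)) -ge $((QUOTA_BYTES * WARN_PERCENT)) ]; then
        echo "warning"      # owners are invited to clean up
    else
        echo "ok"
    fi
}
```

For example, quota_status 17000000000 reports a warning, since 17 GB is above 80% of the 20 GB default.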
Known limits
- LFS objects are pushed to the server despite the blocking of the commits.
- Git LFS uses the SHA-256 hash of a file's content as the LFS object identifier. An object is therefore stored only once on the server, even if several projects reference it. However, it is counted in the quota of each project that references it.
- Alert or error notifications following an action via the web interface or a terminal are sent by e-mail the night following the action triggering the alert.
- A change in size is not taken into account immediately, either upwards or downwards. You have to wait a certain amount of time for the quotas to be updated by GitLab's internal mechanisms.
How can I manage and reduce my disk space?
When does the project disk space monitoring script run?
The script is a pre-receive hook based on the GitLab API, which is activated for certain operations on the GitLab server, such as commit, push, merge request, or branch and tag creations.
What is taken into account when calculating the disk space occupied by my project?
A project's storage space is the sum of the size statistics of the versioned files in the repository, LFS objects, wiki, job artifacts, pipeline artifacts, release packages, snippets and uploads. Container registry (docker images) is not included.
How much storage space is my project taking up?
To find out, go to Settings -> Usage Quotas for your project.
How much space is occupied by all my projects?
- You can get the usage of resources across your projects from your GitLab profile: https://gitlab.inria.fr/-/profile/usage_quotas This gives the storage consumed by all your projects, in total and project by project.
- Alternatively, you can use GraphQL to find out how much space is occupied by each of your projects:
- Connect to https://gitlab.inria.fr/-/graphql-explorer
- Run the following query:
query {
  projects(membership: true, search: "", sort: "name_asc") {
    nodes {
      name
      fullPath
      statistics {
        storageSize
        buildArtifactsSize
        containerRegistrySize
        lfsObjectsSize
        packagesSize
        pipelineArtifactsSize
        repositorySize
        snippetsSize
        uploadsSize
        wikiSize
      }
    }
    pageInfo {
      endCursor
      hasNextPage
    }
  }
}
The result of the query returns, for each project, the space occupied by the various components in bytes. storageSize is the total size of a project.
How much space is taken by the artifacts and logs at each pipeline execution?
Statistics creation at each pipeline execution
How do I remove a large file or a binary added by mistake?
Here's how to rewrite the history, and in particular how to delete a commit or delete a file from each commit: https://git-scm.com/book/en/v2/Git-Tools-Rewriting-History
For non-trivial cases of rewriting, here's a versatile tool : https://github.com/newren/git-filter-repo/
A step by step tutorial based on git filter-repo: https://docs.gitlab.com/ee/topics/git/repository.html#purge-files-from-repository-history
For specifically deleting objects from LFS : https://docs.gitlab.com/ee/topics/git/lfs/#removing-objects-from-lfs
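Before rewriting anything, it helps to know which files weigh the most in the history, including files that have since been deleted. The following sketch lists them with standard git plumbing commands; the function name largest_blobs is made up for the example.

```shell
# List the N largest blobs stored anywhere in the history of a repository,
# as "size-in-bytes path" lines, biggest first.
largest_blobs() {   # usage: largest_blobs [repo_dir] [count]
    repo=${1:-.}
    count=${2:-10}
    git -C "$repo" rev-list --objects --all |
        git -C "$repo" cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        awk '$1 == "blob"' |
        cut -d" " -f2- |
        sort -n -r |
        head -n "$count"
}

# To purge one of these paths from every commit with git filter-repo
# (destructive, run it on a fresh clone and read the tutorial first):
#   git filter-repo --path path/to/big-file.bin --invert-paths
```

The sizes reported are uncompressed object sizes, so they give an order of magnitude and not the exact space used on the server.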
How can I reduce the size of a repository?
It is always best to stay below the size limit, to avoid the project stalling.
This may involve destructive steps such as rewriting the history.
How can I reduce the various spaces occupied by a GitLab project?
https://docs.gitlab.com/ee/user/usage_quotas.html#manage-storage-usage
Why doesn't disk space decrease despite deletions?
After some cleaning, the storage usage can be recalculated from the Settings -> Usage Quotas page with the "Recalculate repository usage" button. Wait about 15 minutes before refreshing the page: the result is not instantaneous.
- By default, housekeeping is run on a repository after 10 commits. Then, a grace period of 2 weeks preserves deleted objects. (Source)
- You can manually "Run housekeeping". Wait 30 minutes for the operation to complete. Then, "Prune unreachable objects": https://docs.gitlab.com/ee/administration/housekeeping.html#prune-unreachable-objects
Clean up repository: use this method to remove internal Git references and unreferenced objects.
Unreferenced LFS files are garbage-collected once a day : https://docs.gitlab.com/ee/raketasks/cleanup.html#remove-unreferenced-lfs-files
For GitLab admins
Incorrect repository statistics shown in the GUI
Usage quota shows incorrect artifact storage usage
Remove orphan artifact files
Delete old builds and artifacts
Delete old pipelines
How do I clean the pipelines (logs, artifacts) ?
Over time, the size of the data generated by gitlab-ci pipelines may grow quickly and sometimes reaches dozens of GB. The quantity of data stored on the gitlab server for pipelines can be checked in the Settings -> Usage Quotas panel of the project, or in Build -> Artifacts. For projects with more than 10GB of artifacts, there is certainly something to do to reduce disk usage:
- From GitLab version 17.9, it is possible to set automatic deletion of old continuous integration pipelines.
- By cleaning old artifacts (cf https://gitlab.inria.fr/siteadmin/doc/-/wikis/faq#how-do-i-delete-the-oldest-or-most-recent-artifacts).
- By disabling "Keep artifacts from most recent successful jobs" in Settings -> CI/CD -> Artifacts if it is not needed, because it keeps the artifacts of every job (build, test, etc.) for every git ref (branches, tags, merge requests, ...), which can take up a lot of space.
- By changing the gitlab-ci jobs definitions, for example:
How do I delete the oldest or most recent artifacts?
By default, artifacts expire after 30 days.
As a bonus, in each project, artifacts from the most recent successful pipeline are stored and do not expire. This option can be deactivated at project level (Settings -> CI/CD -> Artifacts -> Keep artifacts from most recent successful jobs).
To clean old artifacts and pipelines in your projects, you can use:
- Automate storage management for CI/CD pipelines/job logs/job artifacts
- A CLI gitlab api application based on python-gitlab
- Gitlab Quotas Helper Component. This component contains templates to help users in cleaning their artifacts and pipelines in their CI/CD scripts : it adds a job to the CI to remove artifacts and/or pipelines older than N days. Click on Readme tab: https://gitlab.inria.fr/explore/catalog/sdt-pfo/tools/gitlab-quotas-helper-component
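For a one-off cleanup without any of the tools above, the job artifacts API can be called directly. The following is only a sketch, assuming the job list has already been fetched from `GET /projects/:id/jobs` and that the token has sufficient rights to delete artifacts; the helper names are hypothetical and this is not how the tools listed above are implemented.

```python
from datetime import datetime, timedelta, timezone
import urllib.request

GITLAB_URL = "https://gitlab.inria.fr"


def parse_timestamp(value):
    """Parse a GitLab timestamp such as 2024-01-01T00:00:00.000Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def jobs_with_old_artifacts(jobs, max_age_days, now=None):
    """Return the IDs of jobs older than max_age_days that hold an archive."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [
        job["id"]
        for job in jobs
        # Only jobs with an "archive" entry carry downloadable artifacts.
        if any(a.get("file_type") == "archive" for a in job.get("artifacts", []))
        and parse_timestamp(job["created_at"]) < cutoff
    ]


def delete_artifacts(project_id, job_id, token):
    """Delete the artifacts of one job (needs network access)."""
    request = urllib.request.Request(
        f"{GITLAB_URL}/api/v4/projects/{project_id}/jobs/{job_id}/artifacts",
        headers={"PRIVATE-TOKEN": token},
        method="DELETE",
    )
    urllib.request.urlopen(request).close()
```

Selecting the jobs first and deleting afterwards makes it possible to review the list before anything is removed.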
How do I automate storage management for CI/CD pipelines/job logs/job artifacts?
List/delete job artifacts, delete job logs, delete pipelines
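As an illustration of the pipeline part, the sketch below builds the API URL that lists pipelines not updated for a given number of days, and deletes one pipeline by ID. It assumes a token allowed to delete pipelines; the function names and the 100-per-page value are hypothetical choices, and pagination is left out.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode
import urllib.request

GITLAB_URL = "https://gitlab.inria.fr"


def old_pipelines_url(project_id, max_age_days, now=None, per_page=100):
    """URL listing the pipelines last updated before the cutoff date."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    query = urlencode({
        "updated_before": cutoff.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "per_page": per_page,
    })
    return f"{GITLAB_URL}/api/v4/projects/{project_id}/pipelines?{query}"


def delete_pipeline(project_id, pipeline_id, token):
    """Delete one pipeline with its jobs and logs (needs network access)."""
    request = urllib.request.Request(
        f"{GITLAB_URL}/api/v4/projects/{project_id}/pipelines/{pipeline_id}",
        headers={"PRIVATE-TOKEN": token},
        method="DELETE",
    )
    urllib.request.urlopen(request).close()
```

Deleting a pipeline is irreversible, so it is worth listing the candidates first and checking them before calling `delete_pipeline`.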
How do I automate storage management for container registries?
Cleanup policy
List container registries, delete container images, create a cleanup policy for containers, optimize container images
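A cleanup policy can also be set through the projects API instead of the web interface. The sketch below only prepares the request, assuming the `container_expiration_policy_attributes` field of `PUT /projects/:id`; the default values (keep 10 tags, remove tags older than 30 days, run weekly) are illustrative and not a recommendation from this FAQ.

```python
import json
import urllib.request

GITLAB_URL = "https://gitlab.inria.fr"


def cleanup_policy_payload(keep_n=10, older_than="30d", cadence="7d", name_regex=".*"):
    """Build the JSON body that enables a container cleanup policy."""
    return {
        "container_expiration_policy_attributes": {
            "enabled": True,
            "cadence": cadence,        # how often the policy runs
            "keep_n": keep_n,          # number of tags to keep per image
            "older_than": older_than,  # only remove tags older than this
            "name_regex": name_regex,  # tags eligible for removal
        }
    }


def build_policy_request(project_id, token, payload):
    """Prepare (but do not send) the PUT request that applies the policy."""
    return urllib.request.Request(
        f"{GITLAB_URL}/api/v4/projects/{project_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"PRIVATE-TOKEN": token, "Content-Type": "application/json"},
        method="PUT",
    )
```

The request is sent with `urllib.request.urlopen(build_policy_request(...))`; keeping the two steps apart lets the payload be inspected before it is applied.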
How do I automate storage management for package registries?
How do I manage storage used by package assets?
In Settings -> Packages and Registries -> Manage storage used by package assets: when a package with the same name and version is uploaded to the registry, more assets are added to the package. To save storage space, keep only the most recent assets. Select the number of duplicate assets to keep (by default, all are kept).
Merge Request development model and external contributors
This Inria Gitlab instance offers accounts both to Inria members (who log in using their Inria LDAP credentials) and to external users (who can be invited/sponsored by Inria members through the external account portal: https://external-account.inria.fr ). External (non-Inria) users can only join existing projects; they cannot create projects of their own.
This limitation for external users has an important consequence: from the Gitlab point of view, forking a project is creating a project, thus external users cannot fork projects.
As a result, the well-known MR (Merge Request) development model cannot be used directly with external users, but some workarounds can be used in some situations.
The MR development model works as follows:
- A restricted core development team are members of the project and have commit access to the development tree
- Some contributor starts contributing by forking the project ("fork" button). When their contribution is ready, they open a new merge request (in the "new" menu).
- Someone from the core development team can then accept (or refuse) the Merge Request to be merged into the main development tree
This development model has several benefits:
- It limits the access to the main development tree to a small group of trusted core developers.
- It allows the core development team to control finely which contributions to merge or not.
- In particular, it shields the project and forces all contributions to pass through a validation workflow, such as ensuring that the Continuous Integration passes, or that all intellectual property constraints are satisfied, before merging some development.
- With tools such as Github or Gitlab, the web user interface for merge requests is very user friendly.
So, here are some possible workarounds to this development model when using the Inria Gitlab with external contributors:
- Use pull-request mechanisms at the git level, rather than at the Gitlab level. This is the way that the Linux kernel has been developed for a long time now. Instead of submitting an MR with the Gitlab interface, contributors send patches by email.
- For projects with few known external contributors, it is possible for Inria members to create forks on behalf of the external contributors, then transfer ownership of these forks to the contributors. Here is a possible workflow:
- create a new group to handle the forks of your external collaborators
- take care to remain the only owner/maintainer of the group in order to control the creation/destruction of forks
- create a fork of the original project with the newly created group as destination. By default the name of the project will be the same as the original project, e.g.
https://gitlab.inria.fr/ForksOfFoo/bar
- then rename this forked project through Settings -> General -> Advanced Settings, e.g.
https://gitlab.inria.fr/ForksOfFoo/bar-jdoe
- finally add your collaborator as maintainer of that fork
- For projects with few known and trusted external contributors, these contributors can be added to the project but develop in their own branches. There are some basic mechanisms in Gitlab to ensure they do not go beyond their permissions: if these users have the Developer role, they will not be allowed to push to protected branches (master, by default). Details on Gitlab permissions can be found here: https://gitlab.inria.fr/help/user/permissions.md.
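The fork-on-behalf workflow from the second workaround can in principle be scripted by an Inria member with the REST API. The sketch below only computes the two calls involved (fork into the dedicated group under a per-contributor name, then add the contributor as maintainer); it does not send them. The source project `Foo/bar`, the user ID and the function name are hypothetical, and naming the fork at creation time replaces the manual rename step.

```python
from urllib.parse import quote

MAINTAINER = 40  # GitLab access level of the Maintainer role


def fork_plan(source_project, forks_group, contributor_login, contributor_id):
    """Return the API calls as (method, path, parameters) tuples."""
    project_name = source_project.split("/")[-1]
    fork_name = f"{project_name}-{contributor_login}"
    fork_path = f"{forks_group}/{fork_name}"
    return [
        # 1. Fork the project into the group that holds the forks.
        ("POST", f"/api/v4/projects/{quote(source_project, safe='')}/fork",
         {"namespace_path": forks_group, "name": fork_name, "path": fork_name}),
        # 2. Give the external contributor the Maintainer role on the fork.
        ("POST", f"/api/v4/projects/{quote(fork_path, safe='')}/members",
         {"user_id": contributor_id, "access_level": MAINTAINER}),
    ]
```

For example, `fork_plan("Foo/bar", "ForksOfFoo", "jdoe", 1234)` targets the fork `ForksOfFoo/bar-jdoe` used in the steps above.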
If these workarounds do not fit well with what you need and you absolutely want to use the standard MR development model with external contributors, the only solution is to use another tool than the Inria Gitlab.
Gitlab-Pages
- Gitlab-Pages are activated on our gitlab instance. Documentation can be found here: https://gitlab.inria.fr/help/user/project/pages/index.md
- Gitlab pages allow only static html content to be published (it means: no database, no php, etc.). The pages are generated as part of a continuous integration step, so you need a gitlab runner for the pages generation. You can use the shared runners available for this, see existing shared runners.
- There is a basic gitlab-pages example project here: https://gitlab.inria.fr/siteadmin/pages-example The generated web page is here: https://siteadmin.gitlabpages.inria.fr/pages-example/
- There are developed examples in our CI gallery, using Jupyter, Sphinx, Hugo and Doxygen : https://gitlab.inria.fr/gitlabci_gallery/pages
- You can also have a look at https://docs.gitlab.com/ee/user/project/pages/index.html#getting-started Note that there are templates for many website generators.
- Inria gitlab does not support custom domains and certificates. They are required if you want a custom domain that you own to point to a gitlab pages generated site (optionally in https), but Inria has explicitly forbidden this functionality.
- The first time Gitlab-Pages are activated, a question is asked: "Authorize GitLab Pages to use your account? An application called GitLab Pages is requesting access to your GitLab account. Please note that this application is not provided by GitLab and you should verify its authenticity before allowing access. This application will be able to: Access the authenticated user's API, Grants complete read/write access to the API, including all groups and projects, the container registry, and the package registry.". This is strange, but normal. The Gitlab-Pages server is a separate server, and for situations where pages are restricted to project members, the Gitlab-Pages server must use the authentication infrastructure of the Gitlab server. So you need to answer yes.
Broken Links with Uploaded Files in a Project Wiki
The symptom is that clicking on a link to an uploaded resource returns a 404 error. The link follows the pattern (/uploads/4d0b0eecbe6f0bd95a96f3b90cf64fe3/my_file.pdf). In that case, check that the file uploads/4d0b0eecbe6f0bd95a96f3b90cf64fe3/my_file.pdf is not in the git repository of the wiki by:
- cloning the wiki repository (button at the upper right of the wiki);
- checking with
ls uploads/4d0b0eecbe6f0bd95a96f3b90cf64fe3/my_file.pdf
whether the file is stored in the repository. If it is not, then you are in the situation addressed by this FAQ entry.
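When a wiki has many pages, the same check can be run over the whole clone. This is a minimal sketch, assuming the pages are Markdown files (`*.md`) and that legacy links use a 32-character hexadecimal directory name as in the example above; the function names are hypothetical.

```python
import re
from pathlib import Path

# Matches link targets such as (/uploads/<32 hex characters>/my_file.pdf)
UPLOAD_LINK = re.compile(r"\((/uploads/[0-9a-f]{32}/[^)\s]+)\)")


def upload_links(markdown):
    """Return the /uploads/... link targets found in a page."""
    return UPLOAD_LINK.findall(markdown)


def missing_uploads(wiki_clone):
    """Return the linked uploads that are absent from the wiki clone."""
    root = Path(wiki_clone)
    missing = set()
    for page in root.rglob("*.md"):
        for link in upload_links(page.read_text(encoding="utf-8")):
            # The link is absolute; look it up relative to the clone root.
            if not (root / link.lstrip("/")).is_file():
                missing.add(link)
    return sorted(missing)
```

Every path returned by `missing_uploads("path/to/wiki-clone")` corresponds to a link affected by this FAQ entry.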
Explanation
The way the linked files are stored in the wiki has changed (cf. https://docs.gitlab.com/ee/user/project/wiki/#attachment-storage): before the GitLab 11.3 release, the linked files of all the wikis were uploaded to a global /uploads directory. Since 11.3, they are uploaded to an uploads directory inside the git repository of the wiki.
But the files that were uploaded before 11.3 have not been moved to the wiki repositories, causing:
- 404 errors, since the global directory no longer appears to be used when GitLab follows the links inside the wiki;
- the files are not present when cloning the wiki repository, making the clone incomplete.
Workarounds
If the file is not stored in the git repository of the wiki, two solutions exist:
Quick and Dirty Workaround
Change the (/uploads/4d0b0eecbe6f0bd95a96f3b90cf64fe3/my_file.pdf) links to (https://gitlab.inria.fr/uploads/4d0b0eecbe6f0bd95a96f3b90cf64fe3/my_file.pdf).
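If many pages are affected, the link change can be applied to the text of each page in a local clone of the wiki. The sketch below is one possible way to do it, assuming Markdown link syntax and the same 32-character hexadecimal pattern as in the example; the function name is hypothetical.

```python
import re

# Matches link targets such as (/uploads/<32 hex characters>/my_file.pdf)
LEGACY_LINK = re.compile(r"\((/uploads/[0-9a-f]{32}/[^)\s]+)\)")


def absolutize_upload_links(markdown, base_url="https://gitlab.inria.fr"):
    """Prefix legacy /uploads/... link targets with the instance URL."""
    return LEGACY_LINK.sub(lambda m: f"({base_url}{m.group(1)})", markdown)
```

Links that are already absolute do not match the pattern, so running the function twice on the same page leaves it unchanged.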
Long-term Solution
This requires downloading and then re-uploading each linked file:
- the file can be downloaded using the link given in the previous section;
- then upload the file using the "attach a file" button at the bottom right of the wiki editor.