... | ... | @@ -355,16 +355,155 @@ To delete more than 100 pipelines one has to repeat this operation several times |
|
|
|
|
|
After doing that, the storage quota can be recalculated on the Settings -\> Usage Quotas page: click the "Recalculate repository usage" button, then wait several minutes for the page to show up-to-date figures (the result is not instantaneous).
|
|
|
|
|
|
## Git-LFS is not activated
|
|
|
## What are git-lfs's recommended uses?
|
|
|
### What is Git LFS?
|
|
|
|
|
|
The git-lfs feature is not activated. To version data or binary files, an option is to use [git-annex](https://git-annex.branchable.com/).
|
|
|
Git LFS (Large File Storage) is a Git extension that efficiently handles large files by storing them separately from the main Git repository, possibly on a separate server. It is designed to manage large binary files such as images, videos, simulation data, etc., without sacrificing Git's performance or version management.
|
|
|
|
|
|
Why isn't git-LFS activated? We observe that many users try to use git/gitlab to store huge volumes of data such as experimental datasets, experimental results, binary blobs, big data, etc. Git is not designed for such usage, and git-lfs would be the right answer. However, we do not yet have a working, production-ready quota system for gitlab, nor a clear strategy for managing such storage. We do not want to encourage such usage, so we decided not to enable git-lfs.
|
|
|
In fact, Git is well suited to managing text files (based on differences) such as scripts or code, .tex articles or documentation, a laboratory notebook, an html website or system configuration files.
|
|
|
|
|
|
You are welcome to tell us if you need git-lfs for your usage. Your requests will be taken into account in future decisions on this matter.
|
|
|
On the other hand, Git is less suited to managing large or binary files, hence the Git-LFS extension, which versions them through pointer files: Git Large File Storage (LFS) replaces large files such as audio samples, videos, datasets and graphics with text pointers inside Git, while storing the file contents on a remote server.
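To make the idea of a text pointer concrete, here is a minimal sketch that parses the content of a Git LFS pointer file. It assumes the three-line `version` / `oid` / `size` layout of the Git LFS v1 pointer format; the function name and the example oid are hypothetical and only serve as illustration.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the text of a Git LFS pointer file into a dict.

    Assumes the v1 pointer layout: one "key value" pair per line,
    with the keys "version", "oid" and "size".
    """
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(" ")
        fields[key] = value

    version = fields.get("version", "")
    if not version.startswith("https://git-lfs.github.com/spec/"):
        raise ValueError("not a Git LFS pointer file")
    if "oid" not in fields or "size" not in fields:
        raise ValueError("incomplete Git LFS pointer file")

    # The oid is written as "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": version,
        "hash_algo": algo,
        "oid": digest,
        "size": int(fields["size"]),  # size of the real file, in bytes
    }


# Hypothetical pointer content (the oid below is made up for illustration).
EXAMPLE_POINTER = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:" + "ab" * 32 + "\n"
    "size 12345\n"
)

if __name__ == "__main__":
    print(parse_lfs_pointer(EXAMPLE_POINTER))
```

The pointer stays a few lines long whatever the size of the real file, which is why the Git repository itself remains small.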
|
|
|
|
|
|
|
|
|
![](https://notes.inria.fr/uploads/upload_8667897a6eacb89f4170bb9f59994406.png)
|
|
|
|
|
|
You can use Git-LFS with the storage space provided by the Inria GitLab services. In this case, the space used counts towards your disk space quota. You can also use storage that is not provided by the Inria GitLab services by running your own server (see [Deploying your own Git-LFS server](https://gitlab-int.inria.fr/siteadmin/doc/-/wikis/faq#deploying-your-own-git-lfs-server)). In this second case, the space used by Git-LFS does not count towards your quota on the Inria GitLab services.
|
|
|
|
|
|
### Examples of use cases recommended on Inria's gitlab services
|
|
|
|
|
|
A Git-LFS storage service has been activated on the gitlab.inria.fr and gitlab-int.inria.fr servers. The recommended uses of the space thus made available are limited to data directly related to development. The volume consumed in this way counts towards the disk space quotas made available to projects.
|
|
|
|
|
|
#### Recommended uses
|
|
|
Data directly related to development are, for example:

1. Test datasets
2. Datasets used in tutorials or examples
3. Reference result data
4. Images, 3D models, or datasets distributed with the software
5. Traces to understand the performance of each version
6. Reference articles, figures or user documentation in binary formats (pdf, presentations, etc.)
|
|
|
|
|
|
#### Uses not recommended
|
|
|
On the other hand, the following are not considered to be directly related to development:

1. Data for training AI models
2. Software compilation results
3. Compiled versions of dependencies
4. Generated data
5. Calibration files for checking hardware devices
|
|
|
|
|
|
For these purposes of data storage and/or distribution to third parties, the data must be stored elsewhere, for example on a web space hosted on the https://files.inria.fr server (https://doc-si.inria.fr/display/SU/Espace+web).
|
|
|
|
|
|
Git LFS should not be used to back up documents either. To meet this need, a dedicated service has been deployed by the DSI: https://doc-si.inria.fr/display/SU/Sauvegardes#
|
|
|
|
|
|
|
|
|
### Deploying your own Git-LFS server
|
|
|
Reference documentation by Erwan Demairy: a minimal example where git-lfs data is stored on its own server.
|
|
|
https://gitlab.inria.fr/dgdi-sdt/pfo/gitlab-test-lfs
|
|
|
|
|
|
### Reference documentation on Git-LFS
|
|
|
https://docs.gitlab.com/ee/topics/git/lfs/
|
|
|
|
|
|
https://git-lfs.com/
|
|
|
|
|
|
https://docs.gitlab.com/ee/topics/git/lfs/#removing-objects-from-lfs
|
|
|
|
|
|
https://about.gitlab.com/blog/2017/01/30/getting-started-with-git-lfs-tutorial/
|
|
|
|
|
|
|
|
|
## Quota management policy to monitor disk space by project on Gitlab
|
|
|
See the document [Politique_Gestion_Quotas_GitLab.pdf](uploads/c7b7c985df3bbc95e9851872bbb2c7e9/Politique_Gestion_Quotas_GitLab.pdf).
|
|
|
|
|
|
Setting quotas per project is a mechanism for encouraging users to regularly review the relevance of the data they store.
|
|
|
|
|
|
The aim is not so much to save money as to make sensible use of disk space, while still allowing projects to handle large volumes of data when necessary.
|
|
|
|
|
|
The general recommendation is not to have Git repositories larger than 1 GB to preserve performance.
|
|
|
At present, each project has 20 GB of storage space by default. As soon as 80% of this space is used, repository owners are warned and invited to clean up their files and reduce their size as much as possible.
|
|
|
|
|
|
Beyond this default blocking threshold, a dialogue with the DSI and the PFO team will be opened, via a helpdesk ticket addressed to the GitLab team, to assess whether a use of git (including git-lfs) that consumes a large amount of disk space on community servers is justified.
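The two thresholds described above can be sketched as a small check. This is only an illustration, not the monitoring script itself: the function name and status labels are hypothetical, and it assumes that 20 GB means 20 × 10^9 bytes (the policy document may define it differently).

```python
# Assumption: "20 GB" is read as decimal gigabytes (20 * 10**9 bytes).
DEFAULT_QUOTA_BYTES = 20 * 10**9
WARNING_PERCENT = 80  # owners are warned once 80% of the quota is used


def quota_status(used_bytes: int, quota_bytes: int = DEFAULT_QUOTA_BYTES) -> str:
    """Return "ok", "warning" or "over_quota" for a project's disk usage.

    Integer arithmetic is used so that the 80% boundary is exact.
    """
    if used_bytes > quota_bytes:
        return "over_quota"
    if used_bytes * 100 >= WARNING_PERCENT * quota_bytes:
        return "warning"
    return "ok"


if __name__ == "__main__":
    for used in (5 * 10**9, 16 * 10**9, 21 * 10**9):
        print(used, quota_status(used))
```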
|
|
|
|
|
|
## How can I manage and reduce my disk space?
|
|
|
|
|
|
### When does the project disk space monitoring script run?
|
|
|
The script is a pre-receive hook based on the GitLab API. It is triggered by certain operations on the GitLab server, such as commits, pushes, merge requests, and branch or tag creation.
|
|
|
|
|
|
### What is taken into account when calculating the disk space occupied by my project?
|
|
|
A project's storage space is the sum of the size statistics of the versioned files in the repository, LFS objects, wiki, job artifacts, pipeline artifacts, release packages, snippets and uploads.
|
|
|
|
|
|
### How much space is my project taking up?
|
|
|
To find out, go to Settings -> Usage Quotas for your project.
|
|
|
|
|
|
### How much space is occupied by all my projects?
|
|
|
|
|
|
You can use [GraphQL](https://docs.gitlab.com/ee/api/graphql/) to find out how much space is occupied by each of your projects:
|
|
|
1. Connect to https://gitlab.inria.fr/-/graphql-explorer
|
|
|
2. Run the following query:
|
|
|
|
|
|
```graphql
query {
  projects(membership: true, search: "", sort: "name_asc") {
    nodes {
      name
      fullPath
      statistics {
        buildArtifactsSize
        containerRegistrySize
        lfsObjectsSize
        packagesSize
        pipelineArtifactsSize
        repositorySize
        snippetsSize
        uploadsSize
        wikiSize
      }
    }
    pageInfo {
      endCursor
      hasNextPage
    }
  }
}
```
|
|
|
|
|
|
For each project, the query returns the space occupied by each component, in bytes.
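Once the JSON result has been copied out of the explorer, the per-component sizes can be added up per project. The sketch below assumes the usual GraphQL response shape (`data` → `projects` → `nodes`) and sums only the components listed in the section on what counts towards a project's disk space, so `containerRegistrySize` is left out. The function name and the sample data are hypothetical, and the totals are not guaranteed to match the figure shown on the Usage Quotas page.

```python
# Components that count towards a project's storage space,
# as listed in this FAQ (the container registry is not in that list).
COUNTED_FIELDS = (
    "repositorySize",
    "lfsObjectsSize",
    "wikiSize",
    "buildArtifactsSize",
    "pipelineArtifactsSize",
    "packagesSize",
    "snippetsSize",
    "uploadsSize",
)


def total_bytes_per_project(response: dict) -> dict:
    """Map each project's fullPath to the sum of its counted sizes, in bytes."""
    totals = {}
    for node in response["data"]["projects"]["nodes"]:
        stats = node.get("statistics") or {}  # statistics may be null
        totals[node["fullPath"]] = sum(
            (stats.get(field) or 0) for field in COUNTED_FIELDS
        )
    return totals


# Hypothetical response, made up for illustration.
SAMPLE_RESPONSE = {
    "data": {
        "projects": {
            "nodes": [
                {
                    "name": "demo",
                    "fullPath": "group/demo",
                    "statistics": {
                        "buildArtifactsSize": 250,
                        "containerRegistrySize": 4000,
                        "lfsObjectsSize": 500,
                        "packagesSize": 0,
                        "pipelineArtifactsSize": None,
                        "repositorySize": 1000,
                        "snippetsSize": 0,
                        "uploadsSize": 50,
                        "wikiSize": 0,
                    },
                },
                {"name": "empty", "fullPath": "group/empty", "statistics": None},
            ],
            "pageInfo": {"endCursor": None, "hasNextPage": False},
        }
    }
}

if __name__ == "__main__":
    for path, size in total_bytes_per_project(SAMPLE_RESPONSE).items():
        print(f"{path}: {size} bytes")
```

If `hasNextPage` is true in the response, the query has returned only part of your projects, and the totals cover only that page.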
|
|
|
|
|
|
### How do I remove a large file or a binary added by mistake?
|
|
|
|
|
|
Here's how to rewrite the history, and in particular how to delete a commit or delete a file from each commit:
|
|
|
https://git-scm.com/book/en/v2/Git-Tools-Rewriting-History
|
|
|
|
|
|
For non-trivial rewriting cases, here is a versatile tool:
|
|
|
https://github.com/newren/git-filter-repo/
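Before rewriting history, it can help to know which files are actually the largest. The sketch below is one possible approach, not part of the tools cited above: it parses the text produced by `git rev-list --objects --all | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)'` and returns the biggest blobs. The function name and the sample listing are hypothetical.

```python
def largest_blobs(batch_check_output: str, top: int = 10) -> list:
    """Return up to `top` (size_in_bytes, path, object_id) tuples, largest first.

    Each input line is expected to look like:
        <objecttype> <objectname> <objectsize> <path>
    Lines describing commits or trees are ignored.
    """
    blobs = []
    for line in batch_check_output.splitlines():
        parts = line.split(" ", 3)  # the path may itself contain spaces
        if len(parts) < 3 or parts[0] != "blob":
            continue
        path = parts[3] if len(parts) == 4 else ""
        blobs.append((int(parts[2]), path, parts[1]))
    blobs.sort(key=lambda blob: blob[0], reverse=True)
    return blobs[:top]


# Hypothetical listing, made up for illustration (object ids shortened).
SAMPLE_OUTPUT = (
    "commit 1111111 250 \n"
    "tree 2222222 120 src\n"
    "blob 3333333 2048 src/main.py\n"
    "blob 4444444 734003200 data/big dataset.bin\n"
    "blob 5555555 10240 docs/figure.png\n"
)

if __name__ == "__main__":
    for size, path, oid in largest_blobs(SAMPLE_OUTPUT, top=3):
        print(f"{size:>12} {oid} {path}")
```

The paths found this way are the candidates to remove from history or to move to Git-LFS.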
|
|
|
|
|
|
To delete objects from LFS specifically:
|
|
|
https://docs.gitlab.com/ee/topics/git/lfs/#removing-objects-from-lfs
|
|
|
|
|
|
### How can I reduce the size of a repository?
|
|
|
|
|
|
It is always best to stay below the size limit, to avoid the project being blocked.

Reducing the size of a repository may involve destructive steps such as rewriting the history.
|
|
|
|
|
|
https://docs.gitlab.com/16.10/ee/user/project/repository/reducing_the_repo_size_using_git.html#repository-cleanup
|
|
|
|
|
|
### How can I reduce the various spaces occupied by a GitLab project?
|
|
|
|
|
|
https://docs.gitlab.com/ee/user/usage_quotas.html#manage-storage-usage
|
|
|
|
|
|
### Why doesn't disk space decrease despite deletions?
|
|
|
|
|
|
A grace period of 2 weeks preserves deleted objects. This can be shortened: https://docs.gitlab.com/ee/user/project/repository/reducing_the_repo_size_using_git.html#space-not-being-freed
|
|
|
|
|
|
### How do I delete the oldest or most recent artifacts?

By default, artifacts expire after 30 days.
|
|
|
|
|
|
In addition, in each project, artifacts from the most recent successful pipeline are kept and do not expire. This option can be deactivated at project level (Settings -> CI/CD -> Artifacts -> Keep artifacts from most recent successful jobs).
|
|
|
|
|
|
To shorten the retention period of artifacts, or to delete the most recent artifacts, you can use the scripts in the gitlab-api project developed by Denis Arrivault:
|
|
|
|
|
|
https://gitlab.inria.fr/sdt-pfo/tools/gitlab-api
|
|
|
|
|
|
The following project contains templates that help users clean up artifacts and pipelines from their CI/CD scripts:
|
|
|
|
|
|
https://gitlab.inria.fr/sdt-pfo/tools/gitlab-api-templates
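As an illustration of the selection step only (this is not the code of the projects above), the sketch below picks the jobs whose artifacts are older than a given number of days from a list of job records. It assumes records shaped like those returned by the GitLab Jobs API, with `id`, `finished_at` and `artifacts_file` fields; the function names and sample data are hypothetical, and the deletion itself is deliberately left out.

```python
from datetime import datetime, timedelta, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as "2024-01-01T12:00:00.000Z"."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat().
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def jobs_with_old_artifacts(jobs: list, max_age_days: int, now: datetime) -> list:
    """Return the ids of finished jobs whose artifacts exceed max_age_days."""
    cutoff = now - timedelta(days=max_age_days)
    selected = []
    for job in jobs:
        if not job.get("artifacts_file") or not job.get("finished_at"):
            continue  # no artifacts archive, or job not finished
        if parse_timestamp(job["finished_at"]) < cutoff:
            selected.append(job["id"])
    return selected


# Hypothetical job records, made up for illustration.
SAMPLE_JOBS = [
    {"id": 1, "finished_at": "2024-01-01T12:00:00.000Z",
     "artifacts_file": {"filename": "artifacts.zip", "size": 1000}},
    {"id": 2, "finished_at": "2024-02-25T12:00:00.000Z",
     "artifacts_file": {"filename": "artifacts.zip", "size": 2000}},
    {"id": 3, "finished_at": "2023-12-01T12:00:00.000Z",
     "artifacts_file": None},
]

if __name__ == "__main__":
    reference = datetime(2024, 3, 1, tzinfo=timezone.utc)
    print(jobs_with_old_artifacts(SAMPLE_JOBS, max_age_days=30, now=reference))
```

Passing `now` explicitly keeps the selection reproducible and easy to check before anything is deleted.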
|
|
|
|
|
|