gforge-to-gitlab-scripts

Some scripts to migrate content (trackers) from gforge to gitlab

migrate-trackers

Requirements

You need the python3 bindings for the lxml library, available for example from the python3-lxml package on debian or ubuntu.

What you can expect from this script

This script imports the content of the trackers from a gforge project into a gitlab project:

  • initial description,
  • messages,
  • attachments.

All imported issues will have the "imported-from-gforge" label, plus custom labels as you see fit. Issue "importance" in gforge also gets mapped to various labels.

The current version of this script only works with public gforge projects (but it shouldn't be too hard to extend to projects that require authentication).

As this script requires write access to the gitlab server, several conditions must be met. First off, you must obviously know which project you want to modify, and have write access to it.

The authentication material that you provide to the script is an access token (more on this below). This access token is attached to an account, or to a bot account. Depending on that account's access level to the target project, the script can do various things:

  • the bare minimum situation is that of maintainer-level access to the project. In this situation, you will not be able to retain the issue IDs or message dates. The apparent poster of all issues and all issue messages will be the account attached to the access token. Note that since standard bot accounts have maintainer-level permissions, it is possible to use such an account so that all messages appear as posted by a bot.

  • if you have owner-level access to the project, you can keep the issue IDs and message dates. The apparent poster of all issues and all issue messages will be the account attached to the access token. It is currently NOT possible for bot accounts to have owner-level permissions, so bot accounts cannot be used here (at least not until gitlab issue 214046 is fixed).

  • It does not seem possible to do better. See this longer note.
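The difference between the two access levels can be sketched as follows. This is only an illustration, not the actual code of the script: the function `build_issue_payload` and its arguments are hypothetical. It assumes the `iid` and `created_at` parameters of the gitlab issue creation endpoint, which gitlab only honors for sufficiently privileged accounts.

```python
def build_issue_payload(title, description, labels,
                        gforge_id=None, created_at=None, owner_level=False):
    """Build the parameters of an issue creation request (hypothetical helper)."""
    payload = {
        "title": title,
        "description": description,
        # The gitlab api takes labels as a comma-separated string.
        "labels": ",".join(labels),
    }
    if owner_level:
        # Only with owner-level access can the original ID and date be kept.
        if gforge_id is not None:
            payload["iid"] = gforge_id
        if created_at is not None:
            payload["created_at"] = created_at
    return payload
```

With maintainer-level access, the ID and date are simply left out, and gitlab assigns the next free issue ID and the current date.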

Note that gitlab gets very chatty as this script is being used, and sends lots and lots of e-mails under some circumstances (sent by the owner of the access token). There are several provisions to mitigate this, but you (and the members of the project you're pushing to) must be aware of it.

This script is provided on a best-effort basis, and is quite experimental. If it doesn't suit your needs, do not hesitate to adapt it. You can also post bug reports as gitlab issues on this page.

Usage

  • find the numerical group id of your gforge project (this appears as group_id in the url that prints for example the tracker tab in the gforge project).
  • decide on a destination project on gitlab, and record its project path, typically some_user_or_organization/some_project_name. The project numerical id also works (displayed as "Project ID: xxxxx" on the project main page).
  • decide on the access level you want to use (see the section above).
  • maybe edit the userdatabase.json file (see below) to map long names (found on gforge) to known gitlab user names. It is not strictly necessary, but your created issues look better with it. The script will warn you if it finds unmatched users, so you may prefer to do a few trial runs first and act on the warnings they print.
  • get a personalAccessToken (from https://gitlab.yourdomainname.yourtld/profile/personal_access_tokens)
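If you are unsure about the group id, it can be read off the tracker url mechanically. The sketch below uses only the python standard library; the function name `gforge_group_id` and the example url are made up for illustration, and nothing like it is claimed to exist in the script.

```python
from urllib.parse import urlparse, parse_qs

def gforge_group_id(url):
    """Return the numerical group_id found in a gforge url, or None."""
    query = parse_qs(urlparse(url).query)
    values = query.get("group_id")
    return int(values[0]) if values else None

# Hypothetical tracker url, for illustration only.
print(gforge_group_id("https://gforge.example.com/tracker/?group_id=1234"))
```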

Then use the script as follows:

    Usage: CreateGitlabIssuesFromGforge.py [options]

    Recognized options:
        -t, --accesstoken <gitlabPersonalAccessToken>
        -l, --labels <comma separated list of text label to add to gitlab issue> (may be given several times)
        -p, --gitlabprojectname <projectname (path) or numerical project id>
        -S, --gforgeserver <gforge server url, e.g. https://gforge.example.com/>
        -G, --gforgegroupid <numeric gforge project id>
        -g, --gitlabserver <gitlab api url, e.g. https://gitlab.example.com/api/v4>
        -i, --issues <comma separated list of gforge issue ids>
        -u, --userdatabase <path to json file with user matching table>
        --stealth temporarily remove project members while doing batch
        operation, so as to avoid mass e-mails (see [there](#who-receives-tons-of-e-mails-))
        -d, --debug   print calls to the gitlab api
        -w, --write   DO THE ACTIONS FOR REAL (otherwise, just test)

The "project name (path) or numerical project id", for example, can be "joe/sudoku" or "1234"
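Both forms work because the gitlab api accepts either a numerical id or a url-encoded path in its project urls. A minimal sketch of that encoding is given below; the helper name `project_api_url` is hypothetical, and whether the script builds its urls exactly this way is an assumption.

```python
from urllib.parse import quote

def project_api_url(api_url, project):
    """Return the api url of a project, given its path or its numerical id."""
    # safe="" makes quote() encode "/" as "%2F" too, as the gitlab api expects.
    return api_url.rstrip("/") + "/projects/" + quote(str(project), safe="")
```

For example, `project_api_url("https://gitlab.example.com/api/v4", "joe/sudoku")` gives `https://gitlab.example.com/api/v4/projects/joe%2Fsudoku`, while `"1234"` is passed through unchanged.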

Beware: as a side-effect, this script sends e-mails to gitlab accounts, under some conditions that are listed further down in this file.

Best practices

  • Always use -i on a sample issue first (or a comma-separated list of issues)
  • Please fill in a file like userdatabase.json and pass it with -u (see below). Rerun the script until you're satisfied with your user database file.
  • Place (at least temporarily) your project in a private namespace that is owned only by you, so that you don't have inherited project members. You can always move the project to the right location after that. This allows you to use the --stealth argument, and reduce the number of e-mail notifications that are triggered by the script (see this discussion).
  • Once you've followed these best practices, we can safely inform you that the -w option is necessary if you want the script to actually do something.

The user matching database

The script must know how to match gforge names to gitlab logins. This works through a map that you have to write. An example map is given in the userdatabase.json file.

The user database is a list of known matches, given as hashes. Hashes MUST have the three fields below:

  • gforge_login
  • gforge_name
  • gitlab_login

Incomplete records will trigger errors.

It is also possible to specify the three fields above in an array, in exactly the order above. The code translates to a hash anyway.
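The two accepted forms can be illustrated with the sketch below. It is not the code of the script: the names `normalize_record` and `load_user_database` are hypothetical, and the exact error behaviour (here, a `ValueError`) is an assumption. Extra fields in a hash are tolerated by this sketch.

```python
import json

FIELDS = ("gforge_login", "gforge_name", "gitlab_login")

def normalize_record(record):
    """Turn an array-form record into a hash, and check that it is complete."""
    if isinstance(record, list):
        if len(record) != len(FIELDS):
            raise ValueError(f"expected {len(FIELDS)} fields, got {record!r}")
        # Arrays list the fields in exactly the order of FIELDS.
        record = dict(zip(FIELDS, record))
    missing = [f for f in FIELDS if f not in record]
    if missing:
        raise ValueError(f"incomplete record {record!r}: missing {missing}")
    return record

def load_user_database(text):
    """Parse the json text of a user database into a list of hashes."""
    return [normalize_record(r) for r in json.loads(text)]
```

A file containing `[["jdoe", "John Doe", "johndoe"]]` would then be read as a single hash with `gforge_login` set to `jdoe`, `gforge_name` to `John Doe` and `gitlab_login` to `johndoe` (made-up values).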

Note that for gitlab external users, because of this gitlab issue, we cannot obtain the gitlab_login to gitlab_uid mapping, and this is an annoyance to the script. There is a way around, but it cannot work with the API, at least not in all cases (we have an automatic workaround for users who put a custom profile picture, that's it).

The way around is as follows, assuming the gitlab login is johndoe:

  • navigate to https://gitlab.yourdomainname.yourtld/johndoe
  • if you're not signed in, then please sign in (this should bring you back to the same page).
  • view the source, and search for the string: abuse_reports/new?user_id=
  • the gitlab uid you're looking for is right here.
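If you saved the page source to a file, the search in the last two steps can be done with a short regular expression. This is a sketch under the assumption that the uid is the run of digits immediately following the search string; the function name `find_gitlab_uid` is hypothetical.

```python
import re

# The string to look for, followed by the digits of the uid.
UID_PATTERN = re.compile(r"abuse_reports/new\?user_id=(\d+)")

def find_gitlab_uid(page_source):
    """Return the gitlab uid found in the source of a profile page, or None."""
    match = UID_PATTERN.search(page_source)
    return int(match.group(1)) if match else None
```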

Once you've successfully found the gitlab uid, you can simply add an entry for your external user in your user matching database. The userdatabase.json file has an example that shows how this can be done.

Limitations:

=> as a workaround, the description is prefixed with a message indicating the original tracker issue and the reporter. Additionally, the script can be run using a "bot" account accessToken (see above).

Why does sudo mode fall short

For gitlab administrators, the sudo api is apparently an attractive way to address the shortcoming above. Unfortunately, it doesn't work too well for a variety of reasons:

  • obviously, for private projects, only project members can post issues and comment on issues via the api
  • if the script uses the sudo functionality to masquerade as the originating user when creating an issue or commenting on an issue, then it loses the power to modify the date of the issue or comment. And it is not possible to change the modification date after the fact.

We consider that, given these shortcomings, sudo mode has more cons than pros. Therefore, while most of the necessary plumbing is here in the code, we don't use sudo in this script. If you believe otherwise, you may tinker with the can_impersonate() function in CreateGitlabIssuesFromGforge.py.

As of gitlab 13.6, we don't know how to fix this. Future versions of the gitlab api might allow more things.

Who receives tons of e-mails ?

Since issues contain references to people, including so-called @-mentions which are meant to trigger notifications, the natural question is who gets tons of e-mails if you create a large number of gitlab issues (which is what this script is all about).

We try to list the conditions under which e-mails are sent. The list of criteria below is based on experimental observation. It may be inaccurate: do not hesitate to fix it.

Obviously, accounts that are not known to gitlab (and which do not end up formatted as @-references either) have no attached e-mail address that gitlab knows about. These people do not get notification e-mails.

Accounts that are not members of the project being modified do not receive e-mail on issue creation either, even if they're @-mentioned.

The project owner who modifies the project does not receive automatic notification e-mails, except if he/she chose to tick the box "receive e-mails about my own activity" in the gitlab notifications setting.

Accounts that are members of the project being modified, and have a notification level of 'On mention' will (expectedly) receive notifications for every issue post in which they're @-mentioned PLUS posts that they authored, because the script includes an explicit @-mention of the poster.

If you used the --stealth option, the situation should be a lot better, as the script then tries to temporarily revoke project membership for all users while it does its job (and it restores them once it's all done). This means that the only e-mail that users will get will be "Access to the project was granted". Unfortunately, it's not perfect: members of the enclosing group have inherited membership, and the script does not cope well with that. Specifically, we do not wish to open a can of worms by tinkering with the membership at the group level.
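The remove-then-restore idea behind --stealth can be sketched as a context manager. This is an illustration only, not the implementation in the script: the three callables are hypothetical wrappers around the gitlab api, and `list_members` is assumed to return only direct members other than the account that runs the script.

```python
from contextlib import contextmanager

@contextmanager
def stealth(list_members, remove_member, add_member):
    """Temporarily remove direct project members, and restore them afterwards."""
    removed = []
    try:
        for member in list_members():
            remove_member(member)
            removed.append(member)
        yield removed
    finally:
        # Restore memberships even if the batch operation failed midway.
        for member in removed:
            add_member(member)
```

The batch import would run inside the `with stealth(...)` block. Inherited members of the enclosing group are out of reach of such a scheme, which is the limitation described above.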

Dev notes: