gforge-to-gitlab-scripts
Some scripts to migrate content (trackers) from gforge to gitlab
migrate-trackers
Requirements
You need the python3 bindings for the lxml library, available for example from the python3-lxml package on debian or (I guess) ubuntu.
What you can expect from this script
This script imports the content of the trackers from a gforge project:
- initial description,
- messages,
- attachments,

and imports them to a gitlab project.
All imported issues will have the "imported-from-gforge" label, plus custom labels as you see fit. Issue "importance" in gforge also gets mapped to various labels.
The current version of this script only works with public gforge projects (but it shouldn't be too hard to extend to projects that require authentication).
As this script requires write access to the gitlab server, several conditions must be met. First off, you must obviously know which project you want to modify, and have write access to it.
The authentication material that you provide to the script is an access token (more on this below). This access token is attached to an account, or to a bot account. Depending on that account's access level to the target project, the script can do various things:
- the bare minimum situation is that of maintainer-level access to the project. In this situation, you will not be able to retain the issue IDs or message dates. The apparent poster of all issues and all issue messages will be the account attached to the access token. Note that since standard bot accounts have maintainer-level permissions, it is possible to use such an account so that all messages appear as posted by a bot.
- if you have owner-level access to the project, you can keep the issue IDs and message dates. The apparent poster of all issues and all issue messages will be the account attached to the access token. It is currently NOT possible for bot accounts to have owner-level permissions, so bot accounts cannot be used here (at least not until gitlab issue 214046 is fixed).
- It does not seem possible to do better. See this longer note.
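As a rough illustration of the two access levels above, here is a minimal sketch of how the fields sent for one issue could depend on the token's access level. The function name `build_issue_payload` is hypothetical and is not taken from the script. The numeric levels (40 for maintainer, 50 for owner) and the `iid` and `created_at` fields are, to the best of our understanding, the ones used by the gitlab API.

```python
# gitlab numeric access levels, as used by the members API.
MAINTAINER = 40
OWNER = 50


def build_issue_payload(access_level, title, description, gforge_id, created_at):
    """Return the fields to send when creating one issue (hypothetical helper).

    The original issue ID and date are only forwarded with owner-level
    access, because setting them requires owner rights.
    """
    if access_level < MAINTAINER:
        raise PermissionError("maintainer-level access is the bare minimum")
    payload = {"title": title, "description": description}
    if access_level >= OWNER:
        payload["iid"] = gforge_id          # keep the original issue ID
        payload["created_at"] = created_at  # keep the original date (ISO 8601)
    return payload
```

With a maintainer-level token, the returned dictionary only carries the title and the description, which matches the first situation above.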
Note that gitlab gets very chatty as this script is being used, and sends lots and lots of e-mails under some circumstances (sent by the owner of the access token). There are several provisions to mitigate this, but you (and the members of the project you're pushing to) must be aware of it.
This script is provided on a best-effort basis, and is quite experimental. If it doesn't suit your needs, do not hesitate to adapt it. You can also post bug reports as gitlab issues on this page.
Usage
- find the numerical group id of your gforge project (this appears as group_id in the URL that displays, for example, the tracker tab in the gforge project).
- decide on a destination project on gitlab, and record its project path, typically some_user_or_organization/some_project_name. The project numerical id also works (displayed as "Project ID: xxxxx" on the project main page).
- decide on the access level you want to use (see the section above).
- maybe edit the userdatabase.json file (see below) to map long names (found on gforge) to known gitlab user names. It is not strictly necessary, but your created issues look better with it. The script will warn you if it finds unmatched users, so you might as well decide to take action based on the warnings issued by your first trial runs of the script.
- get a personalAccessToken (from https://gitlab.yourdomainname.yourtld/profile/personal_access_tokens)
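If you prefer not to read the group id off the URL by eye, the first step above can be sketched as follows. The helper name `gforge_group_id` and the example URL are made up for illustration; the only assumption is that group_id appears as a query parameter, as described above.

```python
from urllib.parse import urlparse, parse_qs


def gforge_group_id(url):
    """Return the numeric group_id found in a gforge URL's query string."""
    query = parse_qs(urlparse(url).query)
    values = query.get("group_id")
    if not values or not values[0].isdigit():
        raise ValueError(f"no numeric group_id in {url!r}")
    return int(values[0])


# Hypothetical tracker URL, for illustration only.
print(gforge_group_id("https://gforge.example.com/tracker/?atid=5&group_id=1234"))
```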
Then use the script as follows:
Usage: CreateGitlabIssuesFromGforge.py [options]
Recognized options:
-t, --accesstoken <gitlabPersonalAccessToken>
-l, --labels <comma separated list of text label to add to gitlab issue> (may be given several times)
-p, --gitlabprojectname <projectname (path) or numerical project id>
-S, --gforgeserver <gforge server url, e.g. https://gforge.example.com/>
-G, --gforgegroupid <numeric gforge project id>
-g, --gitlabserver <gitlab api url, e.g. https://gitlab.example.com/api/v4>
-i, --issues <comma separated list of gforge issue ids>
-u, --userdatabase <path to json file with user matching table>
--stealth temporarily remove project members while doing batch
operation, so as to avoid mass e-mails (see [there](#who-receives-tons-of-e-mails-))
-d, --debug print calls to the gitlab api
-w, --write DO THE ACTIONS FOR REAL (otherwise, just test)
The "project name (path) or numerical project id", for example, can be "joe/sudoku" or "1234"
Beware: as a side-effect, this script sends e-mails to gitlab accounts, under some conditions that are listed further down in this file.
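Regarding the two accepted forms of the project reference ("joe/sudoku" or "1234"): the gitlab API expects a project path to be URL-encoded, while a numerical id can be used as is. The sketch below shows one way to build the project URL from either form. The helper name `project_api_url` is hypothetical, and this is not necessarily how the script does it.

```python
from urllib.parse import quote


def project_api_url(api_base, project):
    """Build the API URL of a project given its path or numerical id."""
    ref = str(project)
    if not ref.isdigit():
        # Paths such as "joe/sudoku" must be URL-encoded ("/" becomes "%2F").
        ref = quote(ref, safe="")
    return f"{api_base.rstrip('/')}/projects/{ref}"


print(project_api_url("https://gitlab.example.com/api/v4", "joe/sudoku"))
print(project_api_url("https://gitlab.example.com/api/v4", 1234))
```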
Best practices
- Always use -i on a sample issue first (or a comma-separated list of issues)
- Please fill in a file like userdatabase.json and pass it with -u (see below). Rerun the script until you're satisfied with your user database file.
- Place (at least temporarily) your project in a private namespace that is owned only by you, so that you don't have inherited project members. You can always move the project to the right location after that. This allows you to use the --stealth argument, and reduce the number of e-mail notifications that are triggered by the script (see this discussion).
- Once you've followed these best practices, we can safely inform you that the -w option is necessary if you want the script to actually do something.
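The practices above can be summed up as "dry run on a few issues first, write later". Here is a minimal sketch of a wrapper that assembles the command line accordingly. The function `build_command` is hypothetical; the option names are the ones listed in the usage text, and all values (token, server names, issue ids) are placeholders.

```python
import shlex


def build_command(token, project, gforge_server, group_id, gitlab_api,
                  issues=(), userdb=None, stealth=False, write=False):
    """Assemble the argument list for CreateGitlabIssuesFromGforge.py."""
    args = ["python3", "CreateGitlabIssuesFromGforge.py",
            "-t", token, "-p", project,
            "-S", gforge_server, "-G", str(group_id), "-g", gitlab_api]
    if issues:
        args += ["-i", ",".join(str(i) for i in issues)]
    if userdb:
        args += ["-u", userdb]
    if stealth:
        args.append("--stealth")
    if write:
        # Without -w the script only tests and changes nothing.
        args.append("-w")
    return args


# Dry run on two sample issues (placeholder values throughout).
cmd = build_command("MY_TOKEN", "joe/sudoku", "https://gforge.example.com/",
                    1234, "https://gitlab.example.com/api/v4",
                    issues=[101, 102], userdb="userdatabase.json")
print(shlex.join(cmd))
```

Passing `write=True` (and possibly `stealth=True`) is then a deliberate second step, once the dry run looks right.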
The user matching database
The script must know how to match gforge names to gitlab logins. This works through a map that you have to write. An example map is given in the userdatabase.json file.
The user database is a list of known matches, given as hashes. Hashes MUST have the three fields below:
gforge_login
gforge_name
gitlab_login
Incomplete records will trigger errors.
It is also possible to specify the three fields above in an array, in exactly the order above. The code translates to a hash anyway.
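To make the two accepted record shapes concrete, here is a minimal sketch of how a user database could be loaded and checked. The function names are hypothetical and the sample names are invented; only the three field names come from the description above, and the actual code in the script may differ.

```python
import json

FIELDS = ("gforge_login", "gforge_name", "gitlab_login")


def normalize_record(record):
    """Turn an array record into a hash and check the three mandatory fields."""
    if isinstance(record, (list, tuple)):
        if len(record) != len(FIELDS):
            raise ValueError(f"expected {len(FIELDS)} values, got {record!r}")
        record = dict(zip(FIELDS, record))
    missing = [field for field in FIELDS if field not in record]
    if missing:
        raise ValueError(f"incomplete record {record!r}: missing {missing}")
    return record


def load_user_database(text):
    """Parse the JSON text of a user database into a list of hashes."""
    return [normalize_record(record) for record in json.loads(text)]


sample = """[
  {"gforge_login": "jdoe", "gforge_name": "John Doe", "gitlab_login": "johndoe"},
  ["asmith", "Alice Smith", "alice"]
]"""
for entry in load_user_database(sample):
    print(entry["gforge_name"], "->", entry["gitlab_login"])
```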
Note that for gitlab external users, because of this gitlab issue, we cannot obtain the gitlab_login to gitlab_uid mapping, and this is an annoyance to the script. There is a way around, but it cannot work with the API, at least not in all cases (we have an automatic workaround for users who put a custom profile picture, that's it).
The way around is as follows, assuming the gitlab login is johndoe:
- navigate to https://gitlab.yourdomainname.yourtld/johndoe
- if you're not signed in, then please sign in (this should bring you back to the same page).
- view the source, and search for the string:
abuse_reports/new?user_id=
- the gitlab uid you're looking for is right here.
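If you saved the page source to a file, the search in the last two steps can be sketched with a regular expression. The helper name `gitlab_uid_from_profile_html` and the sample HTML fragment are invented for illustration; we only assume that the uid is the run of digits right after the string given above.

```python
import re

UID_PATTERN = re.compile(r"abuse_reports/new\?user_id=(\d+)")


def gitlab_uid_from_profile_html(html):
    """Return the uid found in a profile page's source, or None if absent."""
    match = UID_PATTERN.search(html)
    return int(match.group(1)) if match else None


# Invented fragment of a profile page, for illustration only.
page = '<a href="/abuse_reports/new?user_id=4242&amp;ref_url=x">Report abuse</a>'
print(gitlab_uid_from_profile_html(page))
```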
Once you've successfully found the gitlab uid, you can simply add an entry for your external user in your user matching database. The userdatabase.json file has an example that shows how this can be done.
Limitations:
- cannot set the reporter (gitlab API limitation cf. https://gitlab.com/gitlab-org/gitlab/issues/16140)
=> as a workaround, the description is prefixed with a message indicating the original tracker issue and the reporter. Additionally, the script can be run using a "bot" account accessToken (see above).
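The workaround can be pictured as follows. This is only a sketch: the function name `prefix_description` and the exact wording of the prefix are hypothetical, and the script's actual message may look different.

```python
def prefix_description(description, tracker_url, reporter_name, gitlab_login=None):
    """Prepend a note naming the original tracker issue and its reporter."""
    # Use an @-mention when the reporter is matched to a gitlab login,
    # and fall back to the plain gforge name otherwise.
    who = f"@{gitlab_login}" if gitlab_login else reporter_name
    header = f"*Imported from gforge: {tracker_url}, originally reported by {who}.*"
    return f"{header}\n\n{description}"


print(prefix_description("The solver loops forever.",
                         "https://gforge.example.com/tracker/?aid=101",
                         "John Doe", gitlab_login="johndoe"))
```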
Why does sudo mode fall short
For gitlab administrators, the sudo api is apparently an attractive way to address the shortcoming above. Unfortunately, it doesn't work too well for a variety of reasons:
- obviously, for private projects, only project members can post issues and comment on issues via the api
- if the script uses the sudo functionality to masquerade as the originating user when creating an issue or commenting on an issue, then it loses the power to modify the date of the issue or comment. And it is not possible to change the modification date after the fact.
We consider that, given these shortcomings, sudo has more cons than pros. Therefore, while most of the necessary plumbing is here in the code, we don't use sudo in this script. If you believe otherwise, you may tinker with the can_impersonate() function in CreateGitlabIssuesFromGforge.py.
As of gitlab 13.6, we don't know how to fix this. Future versions of the gitlab api might allow more things.
Who receives tons of e-mails ?
Since issues contain references to people, including so-called @-mentions which are meant to trigger notifications, the natural question is who gets tons of e-mail if you create a large number of gitlab issues (which is what this script is all about).
We try to list the conditions under which e-mails are sent. The list of criteria below is based on experimental observation. It may well be inaccurate, so do not hesitate to fix it.
Obviously, accounts that are not known to gitlab (and which do not end up formatted as @-references either) have no attached e-mail address that gitlab knows about. These people do not get notification e-mails.
Accounts that are not members of the project being modified do not receive e-mail on issue creation either, even if they're @-mentioned.
The project owner who modifies the project does not receive automatic notification e-mails, except if he/she chose to tick the box "receive e-mails about my own activity" in the gitlab notifications setting.
Accounts that are members of the project being modified, and have a notification level of 'On mention' will (expectedly) receive notifications for every issue post in which they're @-mentioned PLUS posts that they authored, because the script includes an explicit @-mention of the poster.
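The observations above can be condensed into a small predicate. This is only a restatement of the experimental rules listed here, not something gitlab guarantees. The function name and its parameters are hypothetical, and the level name "mention" is assumed to be the API spelling of 'On mention'. Cases that the list does not cover return None.

```python
def expects_email(known_to_gitlab, is_member, is_token_owner=False,
                  notifies_own_activity=False, level=None, mentioned=False):
    """Return True/False per the observed rules, or None when they say nothing."""
    if not known_to_gitlab or not is_member:
        return False  # unknown accounts and non-members get nothing
    if is_token_owner:
        return notifies_own_activity  # only if the box is ticked
    if level == "mention":
        # The script @-mentions the original poster, so authored posts
        # count as mentions too.
        return mentioned
    return None  # other notification levels: not covered by the list


print(expects_email(True, True, level="mention", mentioned=True))
```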
If you used the --stealth option, the situation should be a lot better, as the script then tries to temporarily revoke project membership for all users while it does its job (and it restores them once it's all done). This means that the only e-mail that users will get will be "Access to the project was granted". Unfortunately, it's not perfect: members of the enclosing group have inherited membership, and the script does not cope well with that. Specifically, we do not wish to open a can of worms by tinkering with the membership at the group level.
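The remove-then-restore idea can be sketched as a context manager, so that members are put back even if the batch operation fails midway. Everything here is an assumption made for illustration: the `client` interface (`list_members`, `remove_member`, `add_member`) is hypothetical and stands in for the corresponding calls to the gitlab members API, and `FakeClient` is an in-memory stand-in so that the sketch runs without a server. Like the script, it only deals with direct project members, not inherited ones.

```python
from contextlib import contextmanager


@contextmanager
def stealth(client, project, keep_user_id):
    """Temporarily remove direct project members, then restore them."""
    removed = []
    try:
        for member in client.list_members(project):
            if member["id"] == keep_user_id:
                continue  # never remove the account that runs the import
            client.remove_member(project, member["id"])
            removed.append(member)
        yield removed
    finally:
        # Restore whoever was actually removed, with the same access level.
        for member in removed:
            client.add_member(project, member["id"], member["access_level"])


class FakeClient:
    """In-memory stand-in for the members API (illustration only)."""

    def __init__(self, members):
        self.members = {m["id"]: dict(m) for m in members}

    def list_members(self, project):
        return list(self.members.values())

    def remove_member(self, project, user_id):
        del self.members[user_id]

    def add_member(self, project, user_id, access_level):
        self.members[user_id] = {"id": user_id, "access_level": access_level}


client = FakeClient([{"id": 1, "access_level": 50}, {"id": 2, "access_level": 30}])
with stealth(client, "joe/sudoku", keep_user_id=1):
    print(sorted(client.members))  # only the importing account is left
print(sorted(client.members))      # everyone is back
```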