Add sections about cherrypick and bisect

Commit 57c4ce03 authored 1 year ago by MALANDAIN Mathias
--- a/08-advanced.md
+++ b/08-advanced.md
@@ -143,7 +143,7 @@ Speaking of which, of course:
 Just like good ol' rebasing, interactive rebasing rewrites history. Hence, **do NOT use interactive rebasing on a branch that was already pushed on the remote**.

 The basic course of action is as follows:
-* See where you want to start rebasing: you can get the short hash of the commit *just before* the sequence of commits you want to change (e.g., `4f29ed1`), or count the number of commits you want to take into account, in which case the `HEAD~<number_of_commits>` shortcut will come in handy (e.g., `HEAD~5` if you want to change the last 5 commits), or even manipulate the whole branch by using the name of the branch it was branched off (e.g., `main`).
+* See where you want to start rebasing: you can get, with commands such as `git log`, the short hash of the commit *just before* the sequence of commits you want to change (e.g., `4f29ed1`), or count the number of commits you want to take into account, in which case the `HEAD~<number_of_commits>` shortcut will come in handy (e.g., `HEAD~5` if you want to change the last 5 commits), or even manipulate the whole branch by using the name of the branch it was branched off (e.g., `main`).
  * In the latter case, **make sure that you just rebased!** Otherwise, you will be trying to do two things at the same time (integrating fresh changes from `main` into your branch *and* manipulating your local history), and I absolutely do not recommend trying this.
 * Launch the interactive rebase session (`git rebase -i 4f29ed1` or `git rebase -i HEAD~5` or `git rebase -i main` in our examples).
 * A text file appears, listing the commits and prefixing them with a short description of the operation to be performed on each commit. Every commit is initially prefixed with keyword `pick` (basically, keep the commit exactly as it is). Below this list is a long comment telling you what keywords can be used. Change the file as you want (you can also move lines up and down to reorder commits).
@@ -155,17 +155,72 @@ Also, you should definitely play with `git rebase -i` on a dummy project. You mi

 ## Cherrypicking (how to move commits around) {#cherrypick}

-{: .box-error}
-**TODO:** Cherrypicking
+Cherrypicking is basically how one can take a commit on a given branch and apply the same changes on another branch. This absolutely does not look like something anyone would like to use in an actual project, right? If I want to move some code from one branch to another, I would like to either merge, or rebase!

-# Searching for bugs with `git-bisect` {#git-bisect}
+Well, yes and no. Here is a good use case of cherrypicking: a hotfix was applied in the `release` branch of a project, and I want to move it back to the main development branch, but I do not want to merge `release` into `main` (maybe because the other commits on the `release` branch are just a bunch of patches in a user interface that is currently being redesigned from scratch, or because I work with people who regard merging `release` into `main` as malpractice).

-{: .box-error}
-**TODO**
+Here is another one: someone just fixed a bug on the development branch and I want to use the corresponding commit to patch the latest release of the project. However, a bunch of other stuff has been changed on `main` since then, so that I just cannot merge `main` into the release branch.
+
+Finally, to make things clear, this is a case in which cherrypicking is actually a bad idea: a feature that you are currently developing in a fresh branch (that was, say, branched off `main`) was already fully implemented by someone else, in several commits, on another branch `feature`. Even if you cherrypick all these commits at once, this will very likely yield conflicts with your own work, because under the hood, each cherrypicked commit is applied through a small merge of its own! What you probably want to do instead is go back to `main`, create a fresh branch from there, and merge `feature` into it.
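
A minimal sketch of that last piece of advice, played out in a throwaway repository so that nothing real gets hurt. The branch name `my-work`, the file names and the commit messages are all made up for the illustration, and `git checkout` is used for switching branches simply because it works on older Git versions too.

```sh
#!/bin/sh
# Throwaway repository: everything below is illustrative.
set -eu
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # make sure the first branch is called main
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

# Someone else implements the feature, in several commits, on `feature`.
git checkout -q -b feature
echo "part 1" > feature.txt
git add feature.txt
git commit -q -m "Feature, part 1"
echo "part 2" >> feature.txt
git commit -q -am "Feature, part 2"

# Meanwhile, main keeps moving.
git checkout -q main
echo "other work" > other.txt
git add other.txt
git commit -q -m "Other work on main"

# Instead of cherrypicking every commit of `feature`:
# go back to main, create a fresh branch, and merge `feature` into it.
git checkout -q -b my-work
git merge -q --no-edit feature

git log --oneline --graph
```

The graph printed at the end should show the two commits of `feature` joined to the history of `main` by a single merge commit, with their original hashes left untouched.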
+
+Now, for more practical information:
+
+{: .box-success}
+**Cherrypicking** consists in picking the changes that a commit applied to the codebase on one branch, and applying these changes on another branch.
+
+This is a more specific definition than the arguably simpler "applying a commit on another branch". The reason is that, in Git terminology, a commit is an object that not only records the state of the codebase (from which the changes it introduced, the *diff*, are derived), but also information about the author, time of creation, parent commit, etc., and that is identified by a hash computed from all of this. As such, copy-pasting a commit as is would be weird: its metadata would be wrong, and its hash would be shared with the original commit, which defeats the purpose of a hash.
+
+Hence, cherrypicking applies, to the current branch, the same changes that were applied by another commit on another branch, and "packages" them into a fresh commit with its own parent, its own committer information and its own hash (the author and the message of the original commit are kept by default).
+
+Cherrypicking is pretty simple:
+* Find the hash of the commit that you want to "copy" on top of another branch.
+* Switch to the branch in question.
+* Type `git cherry-pick <commit_hash>` (or `git cherry-pick -x <commit_hash>`, see below).
+* Profit.
+
+Adding option `-x` to the `git cherry-pick` command will automatically append `(cherry picked from commit <commit_hash>)` to the message of the fresh commit, which makes it easier to track where the changes come from. Oh, wait, did I write "easier"? I meant "possible in the first place": if this option is not given, the new commit will only appear as a standard commit, created *ex nihilo*, that just happens to have the same message as a commit from another branch. If you want even the slightest bit of traceability, you should definitely use this option.
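
To see the whole thing in action without risking a real project, here is a small sketch of the "fix on `main`, patch on `release`" scenario in a throwaway repository. The file names, commit messages and variable names (`fix_hash`, `new_hash`) are invented for the example; the only thing it is meant to show is `git cherry-pick -x`.

```sh
#!/bin/sh
# Throwaway repository: everything below is illustrative.
set -eu
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # make sure the first branch is called main
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial version"
git branch release                       # the release branch starts here

# Unrelated work on main, then the bug fix we actually care about.
echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "Add unrelated feature"
echo "v1 (fixed)" > app.txt
git commit -q -am "Fix bug in app"
fix_hash=$(git rev-parse HEAD)

# Apply only the fix on top of release, keeping a trace of its origin.
git checkout -q release
git cherry-pick -x "$fix_hash" > /dev/null
new_hash=$(git rev-parse HEAD)

echo "original commit: $fix_hash"
echo "fresh commit:    $new_hash"
git log -1 --format=%B                   # full message of the fresh commit
```

The two hashes differ, the unrelated `feature.txt` does not show up on `release`, and the message of the fresh commit ends with the `(cherry picked from commit ...)` line added by `-x`.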
+
+"Wait, that was it?" Yup, pretty much.
+
+# Searching for bugs with `git bisect` {#git-bisect}
+
+Okay, let us delve into some pretty depressing scenarios.
+
+* Imagine that you just implemented a new test for your software, and it fails miserably. The problem is that this test focuses on a feature that was introduced a long time ago.
+* Or maybe you already had this test but forgot to run it for a pretty long time (...maybe you should think about implementing a [CI pipeline]({{'/05-good-practices#cicd' | relative_url }})). Like, you think you remember it passed last time you ran the whole test suite, in March of last year maybe?
+* Or this test only runs on the versions that are tagged for release, so that you know that this test passed on the latest release, but fails on the one you are currently preparing.
+
+In any case, the current version of the codebase is "bad", the last "good" version was a bazillion commits ago, and these commits introduced changes in thousands of lines of code across dozens of files.
+
+You try and use your favorite debugger to fix the issue, but you just end up trying to understand the logic of dozens or hundreds of methods that you either did not write, or wrote eons ago. The situation looks dire.
+
+Well, what if I told you that you may be able to restrict the scope of your investigation to a few dozen lines?
+
+{: .box-success}
+**`git bisect`** is a simple implementation of binary search (in Baguette: "recherche par dichotomie") that helps you pinpoint the commit that introduced a bug.
+
+It is actually pretty simple to use. In our case, we first want to save our test outside of our working copy, since Git is about to check out older versions of the codebase in which this test may not exist. Once this is done, find a commit hash, a tag, anything that can identify a commit that you know did not "feature" the bug (maybe the one just before the first implementation of the method that your failing test focuses on, or maybe an old version in which the test actually passes if you are dealing with a regression): let us denote it by `<good_commit>` (it could be something like `v2.1`, or `HEAD~100`, for instance).
+
+The magical command is
+
+```sh
+git bisect start HEAD <good_commit>
+```
+
+This command states that you want to start searching for the first "bad" commit, and you know that `HEAD` (the current state of your codebase) is "bad" and `<good_commit>` is "good". The search session starts with Git checking out a specific commit, namely, the one that is "right in the middle" of the range of commits in which you are searching for the bug. If `<good_commit>` is `HEAD~100` and the history is linear, for instance, Git will check out the commit referenced by `HEAD~50`.
+
+You can test this version of the codebase and see whether it works or not. If it does, type `git bisect good`; if it doesn't, `git bisect bad`. You just cut the search space roughly in half: Git tells you how many commits are left in the range of commits to assess, and how many more steps of this game are left before you catch the culprit. It also jumps to another commit, right in the middle of the remaining range, so that you can now test this one.
+
+Lather, rinse, repeat, until Git provides you with the description of the first bad commit. Now you may just inspect this specific commit (just click on it in the "Repository graph" on the webpage of your GitLab project and you will get a nice coloured diff that shows you all changes it introduced, and only those changes). This might make debugging way simpler!
+
+{: .box-warning}
+Do not forget to type `git bisect reset` once this is over, or if you get tired of this game before it ends; Git will clean everything up and return your copy to its state just before you started the binary search. Lovely.
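
If you want to try the game before you actually need it, here is a sketch of a complete session on a dummy repository. Everything in it is made up for the occasion: the 20 commits, the fact that the "bug" sneaks in at commit 13, and the `status.txt` file whose content stands in for "the test passes". The loop simply answers `git bisect good` or `git bisect bad` the way you would by hand, and `refs/bisect/bad` is the reference in which Git keeps track of the earliest commit known to be bad during the session.

```sh
#!/bin/sh
# Throwaway repository: everything below is illustrative.
set -eu
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# Build a linear history of 20 commits; the "bug" appears in commit 13.
i=1
while [ "$i" -le 20 ]; do
  if [ "$i" -ge 13 ]; then echo "broken" > status.txt; else echo "ok" > status.txt; fi
  echo "$i" > counter.txt
  git add -A
  git commit -q -m "commit $i"
  i=$((i + 1))
done

# HEAD (commit 20) is bad, HEAD~19 (commit 1) is good.
git bisect start HEAD HEAD~19 > /dev/null

steps=0
while :; do
  steps=$((steps + 1))
  # Stand-in for "run the test on the commit Git just checked out".
  if grep -q '^ok$' status.txt; then
    out=$(git bisect good)
  else
    out=$(git bisect bad)
  fi
  # Git announces the culprit once the range is down to a single commit.
  case $out in
    *"is the first bad commit"*) break ;;
  esac
done

culprit=$(git rev-parse refs/bisect/bad)
subject=$(git log -1 --format=%s "$culprit")
echo "first bad commit: $subject (found in $steps steps)"

# Clean up and go back to where we started.
git bisect reset > /dev/null 2>&1
```

When the "test" is a command that exits with 0 on success, Git can also play this loop on its own with `git bisect run <command>`; the manual version above is only there to mirror the steps described in this section.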
+
+For more information and options, [the Git SCM website is there for you](https://git-scm.com/docs/git-bisect).

 # Forks and pull requests {#fork}

-I was recently told by a friend of mine that, in the company he is working for, MRs for branches basically do not exist; instead, they rely on MRs for forks. For advanced and experienced users, there are indeed aadvantages to this workflow.
+I was recently told by a friend of mine that, in the company he is working for, MRs for branches basically do not exist; instead, they rely on MRs for *forks*. For advanced and experienced users, there are indeed advantages to this workflow.

 More directly, this is what you will be doing if you are contributing to open source projects, either because you believe in FOSS in general, or just because you *had to* implement new features in an existing software, for your personal use, and have nothing against making these features available to other users if they face the same need.


--- a/index.md
+++ b/index.md
@@ -90,8 +90,8 @@ If you are already somewhat accustomed to Git and GitLab, you may also directly
  * [Juggling with branches]({{'/08-advanced#juggling' | relative_url }})
    * [Rebasing (how to keep up with a source branch)]({{'/08-advanced#rebase' | relative_url }})
    * [Interactive rebasing (how to clean up your mess before pushing)]({{'/08-advanced#irebase' | relative_url }})
-    * [**(WIP)** Cherrypicking (how to move commits around)]({{'/08-advanced#cherrypick' | relative_url }})
-  * [**(WIP)** Searching for bugs with `git-bisect`]({{'/08-advanced#git-bisect' | relative_url }})
+    * [Cherrypicking (how to move commits around)]({{'/08-advanced#cherrypick' | relative_url }})
+  * [Searching for bugs with `git bisect`]({{'/08-advanced#git-bisect' | relative_url }})
  * [Forks and pull requests]({{'/08-advanced#fork' | relative_url }})
  * [CI pipelines: `.gitlab-ci.yml` examples]({{'/08-advanced#gitlab-ci' | relative_url }})
  * [**(WIP)** How to fix a commit to the wrong branch?]({{'/08-advanced#wrong-branch' | relative_url }})