
What is a "label" when rebasing interactively in Git

Tags:

git

git-rebase

When rebasing interactively, Git opens an editor listing the commands one can use. Three of these commands involve something called a "label".

# l, label <label> = label current HEAD with a name
# t, reset <label> = reset HEAD to a label
# m, merge [-C <commit> | -c <commit>] <label> [# <oneline>]
# .       create a merge commit using the original merge commit's
# .       message (or the oneline, if no original merge commit was
# .       specified). Use -c <commit> to reword the commit message.

What is this label and how can one use it?

Luc asked Apr 08 '20 12:04



1 Answer

chepner's comment is exactly right: labels are how git rebase --rebase-merges works. If you're not using --rebase-merges you don't need to know anything further.

Rebasing as a concept

Rebase in general works by copying commits, as if by git cherry-pick. This is because it's impossible to change any existing commit. When we use git rebase, what we want, in the end, is some set of changes—subtle or blatant being up to us—to some existing commits.

That's technically not possible at all, but if we look at how we (humans) use Git and find commits, it's easy after all. We don't change the commits at all! Instead, we copy them to new-and-improved commits, then use the new ones and forget (or abandon) the old ones.

Finding commits: drawing the graph

The way we use Git, and find commits, is to rely on the fact that each commit records the hash ID of its immediate predecessor or parent commit. This means that commits form backwards-looking chains:

... <-F <-G <-H   <--branch

The branch name branch holds the actual, raw hash ID of the last commit in the chain. In this case, whatever the actual commit's hash ID is, we draw it with the letter H as a stand-in.

Commit H contains, as part of its metadata, the raw hash ID of an earlier commit, which we call G. We say that H points to G, and that branch points to H.

Commit G of course points to its parent F, which points back yet further. So when we use Git, we start with a branch name that remembers, for both us and Git, the last commit in the chain. From there we have Git work backwards, one commit at a time, through the chain.
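To make the backwards-looking chain concrete, here is a minimal toy model in Python. It is only a sketch: the Commit class, the walk_back helper, and the branches dictionary are made up for illustration, with single letters standing in for real hash IDs, and none of this is how Git stores commits internally.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    name: str                    # stand-in letter for the real hash ID
    parent: Optional["Commit"]   # the one predecessor; None for a root commit


def walk_back(tip):
    """Yield commits starting at the tip, following parent links backwards."""
    commit = tip
    while commit is not None:
        yield commit
        commit = commit.parent


F = Commit("F", None)   # treated as a root here, just to keep the toy small
G = Commit("G", F)
H = Commit("H", G)

# A branch name simply remembers the last commit in the chain.
branches = {"branch": H}

history = [c.name for c in walk_back(branches["branch"])]
print(history)
```

Starting from the name branch, the walk visits H, then G, then F, which is the same order in which git log would show them.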

A merge commit is simply a commit with at least two parents, instead of the usual one. So a merge commit M looks like this:

...--J
      \
       M   <-- somebranch
      /
...--L

where J and L are the two parents of the merge. Typically (though not necessarily), the histories first fork, then merge:

          I--J
         /    \
...--G--H      M--N--...
         \    /
          K--L

and we can call each of the I-J and K-L spurs a branch, or we can treat everything up to and including M and/or N as a single branch. There must, after all, be some branch name pointing to some commit towards the right. How else did we find commit M in the first place?

(We can, if we like, add branch names pointing to any commit at any time. Adding a branch name means that the commits are all now on an additional branch, over and above whichever branches they were on before. Deleting a name removes that branch from the set of branches that contain those commits.)

Walking backwards through a merge commit is tricky: Git has to follow both forks, here both the I-J and K-L forks. Both git log and git rev-list do this internally using a priority queue, though we won't go into any details here.

Anyway, the key here is that because commits store parent hash IDs, and the arrows all point backwards, commits form a Directed Acyclic Graph or DAG. We—and Git—find commits using a branch name, which by definition points to the last commit in some part of the DAG. From there we have Git walk backwards.
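As a rough illustration of walking a DAG that contains a merge, the sketch below visits the fork-and-merge graph drawn earlier using a priority queue. The commit times are invented, G is treated as a root, and the walk function is a hypothetical stand-in; the real git log and git rev-list are considerably more elaborate.

```python
import heapq

# name -> (made-up commit time, list of parent names)
commits = {
    "G": (1, []),
    "H": (2, ["G"]),
    "I": (3, ["H"]),
    "K": (4, ["H"]),
    "J": (5, ["I"]),
    "L": (6, ["K"]),
    "M": (7, ["J", "L"]),   # the merge: two parents
    "N": (8, ["M"]),
}


def walk(tip):
    """Visit every commit reachable from tip once, newest first."""
    seen = {tip}
    queue = [(-commits[tip][0], tip)]   # negate times: heapq is a min-heap
    order = []
    while queue:
        _, name = heapq.heappop(queue)
        order.append(name)
        for parent in commits[name][1]:
            if parent not in seen:      # H is reachable via both forks
                seen.add(parent)
                heapq.heappush(queue, (-commits[parent][0], parent))
    return order


order = walk("N")
print(order)
```

With these particular times the two forks come out interleaved, and commit H shows up only once even though both I and K point back to it.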

Rebasing in a nutshell

Suppose we were to take some existing simple chain of commits such as A-B-C here:

...--o--o--*--o--o   <-- master
            \
             A--B--C   <-- branch (HEAD)

and copy them to new commits like this:

                   A'-B'-C'  <-- HEAD
                  /
...--o--o--*--o--o   <-- master
            \
             A--B--C   <-- branch

This uses Git's detached HEAD mode, where HEAD points directly to a commit. So the name branch still finds the original commits, while HEAD, detached now, finds the new copies. Without worrying too much about what exactly is different in the new copies, what if we now forced Git to move the name branch so that it points, not to C, but to C' instead? That is, in terms of the drawing, we'll do this:

                   A'-B'-C'  <-- branch (HEAD)
                  /
...--o--o--*--o--@   <-- master
            \
             A--B--C

Having moved branch, we also re-attach our HEAD so that we can be back in normal everyday Git mode, rather than in the middle of a rebase. And now, when we look for commits, we'll find the new copies, not the originals. The new copies are new: They have different hash IDs. If we actually remembered hash IDs, we'd see that ... but we find commits by starting at a branch name and working backwards, and when we do that, we've totally abandoned the originals and see only the new copies.

So that's how rebase works, in the absence of merges, anyway. Git:

  • lists out some commits to copy;
  • detaches HEAD to the place the copies should go;
  • copies the commits, as if by git cherry-pick (often actually with git cherry-pick), one at a time; and then
  • moves the branch name and re-attaches HEAD.

(There are a lot of corner cases here, such as: what happens if you start out with a detached HEAD, and what happens with merge conflicts. We'll just ignore all of those.)
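The four steps above can be sketched on a toy graph. This is a sketch under loud assumptions: commits are just names, "copying" only creates a primed name with a new parent, and the helpers commits_to_copy and rebase are hypothetical, not Git's code. The name o1 stands for the unlabeled commit between * and @ in the drawing.

```python
# Toy graph from the drawing: * is the old base, @ is the tip of master.
parents = {"*": None, "o1": "*", "@": "o1", "A": "*", "B": "A", "C": "B"}
refs = {"master": "@", "branch": "C"}


def ancestors(name):
    """Return name and everything before it, newest first."""
    out = []
    while name is not None:
        out.append(name)
        name = parents[name]
    return out


def commits_to_copy(tip, upstream):
    """Commits reachable from tip but not from upstream, oldest first."""
    excluded = set(ancestors(upstream))
    return [c for c in reversed(ancestors(tip)) if c not in excluded]


def rebase(branch, upstream):
    todo = commits_to_copy(refs[branch], refs[upstream])  # 1: list commits
    head = refs[upstream]                                 # 2: detach HEAD
    for old in todo:                                      # 3: copy one by one
        new = old + "'"
        parents[new] = head    # the parent is whatever HEAD was
        head = new
    refs[branch] = head                                   # 4: move the name
    return todo


copied = rebase("branch", "master")
print(copied, refs["branch"])
```

The originals A, B, and C are still in the parents table afterwards; nothing changed them. They are simply no longer reachable from the name branch.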

A bit about cherry-pick

Above, I said:

Without worrying too much about what exactly is different in the new copies ...

What exactly is different? Well, a commit itself holds a snapshot of all of your files, plus metadata: the name and email address of whoever made the commit, the log message, and so on, and, all-important for Git's DAG, the parent hash ID(s) of that commit. Since the new copies come after a different point (the old base was * and the new base is @, the tip of master), obviously the parent hash IDs had to change.

Given that adding a new commit works by setting the new commit's parent to the current commit, the updated parents happen automatically during the copying process, as we copy commits, one commit at a time. That is, first we check out commit @, then we copy A to A'. The parent of A' is @, automatically. Then we copy B to B' and the parent of B' is A', automatically. So there's no real magic here: this is just basic, everyday Git.

The snapshots are probably different too, though, and that's where git cherry-pick really comes in. Cherry-pick has to view each commit as some set of changes. To view a commit as a change, we must compare the commit's snapshot against the commit's parent's snapshot.

That is, given:

...--G--H--...

we can see what changed in H by first extracting G to a temporary area, then extracting H to a temporary area, then comparing the two temporary areas. For files that are the same, we say nothing at all; for files that are different, we produce a diff listing. That tells us what changed in H.
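Here is a small sketch of that comparison, treating each snapshot as a plain dictionary from path to file content. The file names and contents are invented, and diff_snapshots is a hypothetical helper that reports whole-file changes only; a real git diff also works out line-by-line differences.

```python
def diff_snapshots(parent, child):
    """Map each changed path to (content before, content after).

    None means the file does not exist on that side.
    Unchanged files are left out entirely.
    """
    changes = {}
    for path in sorted(set(parent) | set(child)):
        before, after = parent.get(path), child.get(path)
        if before != after:
            changes[path] = (before, after)
    return changes


# Made-up snapshots for commits G and H.
G = {"README": "hello\n", "main.py": "print(1)\n", "old.txt": "x\n"}
H = {"README": "hello\n", "main.py": "print(2)\n", "new.txt": "y\n"}

changes = diff_snapshots(G, H)
for path, (before, after) in changes.items():
    print(path, repr(before), "->", repr(after))
```

README is identical in both snapshots, so nothing is said about it; the other three files show up as modified, added, and deleted respectively.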

So for git cherry-pick to copy a commit, it just has to turn the commit into changes. That requires looking at the commit's parent. For commits A-B-C, that's no problem: the parent of A is *; the parent of B is A; and the parent of C is B. Git can find the first set of changes (* vs A), apply those changes to the snapshot in @, and make A' that way. Then it finds the A-vs-B changes and applies those to A' to make B', and so on.

This works fine for ordinary, single-parent commits. It does not work at all for merge commits.
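The copy step for a single-parent commit can be sketched like this. It is a deliberately crude model: snapshots are dictionaries, changes are applied a whole file at a time, and conflicts are ignored, whereas the real cherry-pick runs a three-way merge. The names changes_between and cherry_pick, and the file contents, are made up for the example.

```python
def changes_between(parent, commit):
    """Paths whose content differs, mapped to the new content (None = deleted)."""
    paths = set(parent) | set(commit)
    return {p: commit.get(p) for p in paths if parent.get(p) != commit.get(p)}


def cherry_pick(parent_snapshot, commit_snapshot, target_snapshot):
    """Apply the parent-vs-commit changes on top of target_snapshot."""
    result = dict(target_snapshot)
    changes = changes_between(parent_snapshot, commit_snapshot)
    for path, content in changes.items():
        if content is None:
            result.pop(path, None)   # the commit deleted this file
        else:
            result[path] = content   # the commit added or modified this file
    return result


star = {"app.txt": "v1"}                       # old base, drawn as *
A = {"app.txt": "v1", "feature.txt": "a"}      # A adds feature.txt
at = {"app.txt": "v2", "docs.txt": "d"}        # new base, drawn as @

A_prime = cherry_pick(star, A, at)
print(A_prime)
```

The snapshot for A' keeps everything from @ and gains only what A changed relative to *, which is why the copy needs A's parent and not just A itself.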

Copying a merge is impossible, so rebase doesn't try

Suppose we have a set of commits with a merge bubble, and this set of commits itself could be rebased:

           I--J
          /    \
         H      M   <-- feature (HEAD)
        / \    /
       /   K--L
      /
...--G-------N--O--P   <-- mainline

We might like to git rebase the feature commits atop commit P now. If we do, the default result is either:

...--G-------N--O--P   <-- mainline
                    \
                     H'-I'-J'-K'-L'  <-- feature (HEAD)

or:

...--G-------N--O--P   <-- mainline
                    \
                     H'-K'-L'-I'-J'  <-- feature (HEAD)

(I did not bother drawing in the abandoned commits, to save space.)

It is up to git rev-list to pick an order for I-J and K-L during the list-commits-to-copy part of the rebasing process. Commit M, the merge, is simply dropped: the two branches that resulted in merge commit M are flattened into one simple linear chain. This avoids the need to copy commit M, at the expense of sometimes not being able to copy the commits very well (having a lot of merge conflicts) and of course destroying our nice little merge bubble, if we wanted to keep it.
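The flattening can be imitated on the toy graph. In the sketch below, flat_todo is a hypothetical helper that lists parents before children, following the first parent first, and then drops anything with two or more parents. The real ordering is whatever git rev-list chooses, so this produces only one of the two possible results shown above.

```python
# The graph from the drawing; G is treated as a root to keep the toy small.
parents = {
    "G": [], "N": ["G"], "O": ["N"], "P": ["O"],
    "H": ["G"], "I": ["H"], "J": ["I"], "K": ["H"], "L": ["K"],
    "M": ["J", "L"],
}


def reachable(tip):
    """Every commit reachable from tip, including tip itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def flat_todo(tip, upstream):
    """Commits on tip but not on upstream, oldest first, merges dropped."""
    wanted = reachable(tip) - reachable(upstream)
    ordered = []

    def visit(commit):
        if commit in wanted and commit not in ordered:
            for parent in parents[commit]:   # first parent first
                visit(parent)
            ordered.append(commit)

    visit(tip)
    return [c for c in ordered if len(parents[c]) < 2]


todo = flat_todo("M", "P")
print(todo)
```

Commit M never makes it into the list, so the copies end up as one straight line on top of P.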

Cherry-pick cannot copy a merge ...

While you can run git cherry-pick on a merge commit, the resulting commit is an ordinary, non-merge commit. Furthermore, you must tell Git which parent to use, via the -m (--mainline) option. Cherry-picking fundamentally has to diff the commit's parent vs the commit, but a merge has two parents, and Git simply does not know which of the two to use. You must tell it which one ... and then it copies the changes found by the diff, which is not what git merge is all about.
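A toy example shows why the choice of parent matters (in real Git, git cherry-pick takes the parent number through its -m option). The snapshots below are invented, and changed_paths is a hypothetical helper comparing whole files only.

```python
# Made-up snapshots: J and L are the two parents of merge M.
J = {"base.txt": "b", "left.txt": "from I-J"}
L = {"base.txt": "b", "right.txt": "from K-L"}
M = {"base.txt": "b", "left.txt": "from I-J", "right.txt": "from K-L"}


def changed_paths(parent, commit):
    """Sorted list of paths that differ between the two snapshots."""
    paths = set(parent) | set(commit)
    return sorted(p for p in paths if parent.get(p) != commit.get(p))


vs_first_parent = changed_paths(J, M)    # what M "changed" relative to J
vs_second_parent = changed_paths(L, M)   # what M "changed" relative to L

print(vs_first_parent, vs_second_parent)
```

Against J, the merge appears to add right.txt; against L, it appears to add left.txt. Neither diff on its own describes the merge, and whichever one is picked gets recorded as an ordinary single-parent commit.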

... so to rebase and keep merges, git rebase re-performs the merges

What this all means for git rebase is that in order to "preserve" a merge, Git has to run git merge itself.

That is, suppose we are given:

           I--J
          /    \
         H      M   <-- feature (HEAD)
        / \    /
       /   K--L
      /
...--G-------N--O--P   <-- mainline

and we want to achieve:

                         I'-J'
                        /    \
                       H'     M'  <-- feature (HEAD)
                      / \    /
                     /   K'-L'
                    /
...--G-------N--O--P   <-- mainline

Git's rebase can do this, but to do it, it must:

  • copy H to H' and drop a marker here;
  • pick one of I or K to copy to I' or K', then copy either J or L next; let's say we pick I-J to do;
  • drop a marker pointing to J';
  • git checkout the H' copy it made earlier using the marker;
  • copy K and L now, to K' and L', and drop a marker here

so that as our intermediate result-so-far, we have:

                         I'-J'   <-- marker2
                        /
                       H'  <-- marker1
                      / \
                     /   K'-L'   <-- marker3
                    /
...--G-------N--O--P   <-- mainline

Git can now git checkout commit J' using marker2, run git merge on commit L' using marker3, and thereby produce commit M', a new merge that used H' as its merge base and J' and L' as its two branch-tip commits.

Once the merge is done, the rebase-as-a-whole is done, and Git can delete the markers and yank the branch name feature over as usual.

If we're a little clever, we can let HEAD act as one of the three markers sometimes, but it's more straightforward to just drop markers each time. I'm not sure offhand which technique git rebase --rebase-merges actually uses.

The label, reset, and merge commands create and use the various markers. The merge command requires that HEAD point to the commit that will be the first parent of the resulting merge (since git merge works that way). It's interesting that the syntax suggests that octopus merges are forbidden here: they should Just Work and hence should be allowed.

(The -C in the merge command can use the raw hash ID of the original merge commit, since that's always unchanged. The labels you'll see, if you use --rebase-merges with a set of commits that contains merges, are generated by Git from the commit messages, and until quite recently there was a bug here.)
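To tie the three commands to the markers, here is a toy interpreter for a hand-written todo list. Loud assumptions: the todo text is written by me in the style of what --rebase-merges produces (real lists use commit hashes and subjects, Git-generated label names, and possibly a different order), the label names branch-point, left, and right are invented, and run only tracks parent links without touching any files.

```python
todo = """\
label onto
pick H
label branch-point
pick I
pick J
label left
reset branch-point
pick K
pick L
label right
reset left
merge -C M right
"""


def run(todo_text, onto):
    """Execute the toy todo list; return (final HEAD, parents, labels)."""
    parents = {}    # new commit name -> list of parent names
    labels = {}     # label name -> commit name
    head = onto     # detached HEAD starts at the new base
    for line in todo_text.splitlines():
        words = line.split()
        if not words:
            continue
        command, args = words[0], words[1:]
        if command == "label":       # remember where HEAD is right now
            labels[args[0]] = head
        elif command == "reset":     # move HEAD back to a remembered place
            head = labels[args[0]]
        elif command == "pick":      # copy one commit on top of HEAD
            new = args[0] + "'"
            parents[new] = [head]
            head = new
        elif command == "merge":     # args: -C <original merge> <label>
            new = args[1] + "'"
            parents[new] = [head, labels[args[2]]]   # HEAD is the first parent
            head = new
        else:
            raise ValueError("unknown command: " + command)
    return head, parents, labels


head, parents, labels = run(todo, "P")
print(head, parents[head])
```

The result is M' with J' as its first parent and L' as its second, and both I' and K' hang off H', which is the shape drawn above. The labels exist only while the list runs, much as the markers are deleted once the rebase finishes.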

Side note: evil merges and "ours" merges don't survive

When Git re-performs a merge, it just uses the regular merge engine. Git does not know about any flags used during the original merge, or any changes introduced as an "evil merge". So -X ours, -s ours, or extra changes just get lost during this kind of rebase. Of course, if the merge has merge conflicts, you get a chance to re-insert evil-merge changes, or redo the merge entirely however you like.

See also Evil merges in git?

torek answered Oct 20 '22 06:10