When interactively rebasing, Git opens an editor with a list of commands one can use. Three of these commands have to do with something called a label.
# l, label <label> = label current HEAD with a name
# t, reset <label> = reset HEAD to a label
# m, merge [-C <commit> | -c <commit>] <label> [# <oneline>]
# .       create a merge commit using the original merge commit's
# .       message (or the oneline, if no original merge commit was
# .       specified). Use -c <commit> to reword the commit message.
What is this label, and how can one use it?
chepner's comment is exactly right: labels are how git rebase --rebase-merges works. If you're not using --rebase-merges, you don't need to know anything further.
Rebase in general works by copying commits, as if by git cherry-pick. This is because it's impossible to change any existing commit. When we use git rebase, what we want, in the end, is some set of changes (subtle or blatant, that's up to us) to some existing commits.
That's technically not possible at all, but if we look at how we (humans) use Git and find commits, it's easy after all. We don't change the commits at all! Instead, we copy them to new-and-improved commits, then use the new ones and forget (or abandon) the old ones.
The way we use Git, and find commits, is to rely on the fact that each commit records the hash ID of its immediate predecessor or parent commit. This means that commits form backwards-looking chains:
... <-F <-G <-H <--branch
The branch name branch holds the actual, raw hash ID of the last commit in the chain. In this case, whatever the actual commit's hash ID is, we draw it with the letter H as a stand-in.
Commit H contains, as part of its metadata, the raw hash ID of an earlier commit, which we call G. We say that H points to G, and that branch points to H.
Commit G of course points to its parent F, which points back yet further. So when we use Git, we start with a branch name that remembers, for both us and Git, the last commit in the chain. From there we have Git work backwards, one commit at a time, through the chain.
A merge commit is simply a commit with at least two parents, instead of the usual one. So a merge commit M looks like this:
...--J
      \
       M   <-- somebranch
      /
...--L
where J and L are the two parents of the merge. Typically (though this is not strictly necessary), the histories first fork, then merge:
          I--J
         /    \
...--G--H      M--N--...
         \    /
          K--L
and we can call the I-J and K-L spurs branches, or we can treat everything up to and including M and/or N as a single branch. There must, after all, be some branch name pointing to some commit towards the right. How else did we find commit M in the first place?
(We can, if we like, add branch names pointing to any commit at any time. Adding a branch name means that the commits are all now on an additional branch, over and above whichever branches they were on before. Deleting a name removes that branch from the set of branches that contain those commits.)
Walking backwards through a merge commit is tricky: Git has to start looking at both forks, here both the I-J and K-L forks. Git does this internally, in git log and git rev-list, using a priority queue, though we won't go into any details here.
Anyway, the key here is that because commits store parent hash IDs, and the arrows all point backwards, commits form a Directed Acyclic Graph or DAG. We—and Git—find commits using a branch name, which by definition points to the last commit in some part of the DAG. From there we have Git walk backwards.
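A fork-then-merge history like the one drawn above can be built and inspected with a short script. This is only a sketch: the branch names somebranch and side, and the empty commits named after the letters in the drawing, are my own placeholders.

```shell
set -eu
cd "$(mktemp -d)"                              # throwaway repository (assumption)
git init -q
git symbolic-ref HEAD refs/heads/somebranch
git config user.name "Example"
git config user.email "example@example.com"

git commit -q --allow-empty -m H               # the fork point
git checkout -q -b side
git commit -q --allow-empty -m K
git commit -q --allow-empty -m L
git checkout -q somebranch
git commit -q --allow-empty -m I
git commit -q --allow-empty -m J
git merge -q --no-ff -m M side                 # M gets two parents: J first, then L

# One line: the merge's hash ID followed by the hash IDs of both parents.
git rev-list --parents -n 1 somebranch

# Git walks both forks backwards from M; --graph draws the resulting DAG.
git --no-pager log --graph --format='%s' somebranch
```

The first parent (somebranch^1) is the commit that was checked out when git merge ran, and the second (somebranch^2) is the tip that was merged in.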
Suppose we were to take some existing simple chain of commits such as A-B-C here:
...--o--o--*--o--o   <-- master
            \
             A--B--C   <-- branch (HEAD)
and copy them to new commits like this:
                   A'-B'-C'   <-- HEAD
                  /
...--o--o--*--o--o   <-- master
            \
             A--B--C   <-- branch
This uses Git's detached HEAD mode, where HEAD points directly to a commit. So the name branch still finds the original commits, while HEAD, detached now, finds the new copies. Without worrying too much about what exactly is different in the new copies, what if we now forced Git to move the name branch so that it points, not to C, but to C' instead? That is, in terms of the drawing, we'll do this:
                   A'-B'-C'   <-- branch (HEAD)
                  /
...--o--o--*--o--@   <-- master
            \
             A--B--C
Having moved branch, we also re-attach our HEAD so that we can be back in normal everyday Git mode, rather than in the middle of a rebase. And now, when we look for commits, we'll find the new copies, not the originals. The new copies are new: they have different hash IDs. If we actually remembered hash IDs, we'd see that ... but we find commits by starting at a branch name and working backwards, and when we do that, we've totally abandoned the originals and see only the new copies.
So that's how rebase works, in the absence of merges, anyway. Git:
1. lists the commits to copy;
2. detaches HEAD and moves it to the place the copies should go;
3. copies the commits, as if by git cherry-pick (often actually with git cherry-pick), one at a time; and then
4. moves the branch name to the last copied commit and re-attaches HEAD.
(There are a lot of corner cases here, such as: what happens if you start out with a detached HEAD, and what happens with merge conflicts. We'll just ignore all of those.)
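The steps above can be carried out by hand, which makes it clearer that a simple rebase is ordinary Git commands strung together. The following is a sketch, not what git rebase literally runs internally. The commit helper, the file names, and the branch names master and branch are placeholders matching the drawings.

```shell
set -eu
cd "$(mktemp -d)"                          # throwaway repository (assumption)
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: each commit adds one file named after it, so nothing conflicts.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base                                # the commit drawn as *
git checkout -q -b branch
commit A; commit B; commit C
git checkout -q master
commit new-base                            # the commit drawn as @
old_tip=$(git rev-parse branch)

# 1. List the commits to copy, oldest first.
to_copy=$(git rev-list --reverse master..branch)

# 2. Detach HEAD at the place the copies should go.
git checkout -q --detach master

# 3. Copy the commits one at a time.
for c in $to_copy; do
  git cherry-pick "$c" >/dev/null
done

# 4. Force the branch name onto the last copy and re-attach HEAD.
git checkout -q -B branch

git --no-pager log --format='%h %s' master..branch
```

After this, the originals are still in the repository (old_tip remembers the old C), but nothing that starts from the name branch will find them.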
Above, I said:
Without worrying too much about what exactly is different in the new copies ...
What exactly is different? Well, a commit itself holds a snapshot of all of your files, plus metadata: the name and email address of whoever made the commit, the log message, and so on, and, all-important for Git's DAG, the parent hash ID(s) of that commit. Since the new copies come after a different point (the old base was * and the new base is @), obviously the parent hash IDs had to change.
Given that adding a new commit works by setting the new commit's parent to the current commit, the updated parents happen automatically during the copying process, as we copy commits, one commit at a time. That is, first we check out commit @, then we copy A to A'. The parent of A' is @, automatically. Then we copy B to B' and the parent of B' is A', automatically. So there's no real magic here: this is just basic, everyday Git.
The snapshots are probably different too, though, and that's where git cherry-pick really comes in. Cherry-pick has to view each commit as some set of changes. To view a commit as a change, we must compare the commit's snapshot against the commit's parent's snapshot.
That is, given:
...--G--H--...
we can see what changed in H by first extracting G to a temporary area, then extracting H to a temporary area, then comparing the two temporary areas. For files that are the same, we say nothing at all; for files that are different, we produce a diff listing. That tells us what changed in H.
So for git cherry-pick to copy a commit, it just has to turn the commit into changes. That requires looking at the commit's parent. For commits A-B-C, that's no problem: the parent of A is *; the parent of B is A; and the parent of C is B. Git can find the first set of changes (* vs A), apply those changes to the snapshot in @, and make A' that way. Then it finds the A-vs-B changes and applies those to A' to make B', and so on.
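One way to check that a copy carries the same change as its original, even though its hash ID, parent, and snapshot all differ, is to compare patch IDs. The sketch below assumes a throwaway repository with placeholder commit names; git patch-id computes a hash over the diff text only.

```shell
set -eu
cd "$(mktemp -d)"                          # throwaway repository (assumption)
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: each commit adds one file named after it.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base                                # drawn as *
git checkout -q -b branch
commit A
a=$(git rev-parse HEAD)
git checkout -q master
commit new-base                            # drawn as @

# Viewing A as a change: compare its parent's snapshot against its own.
git --no-pager diff --stat "$a^" "$a"

# Copy A onto @; the result is A'.
git cherry-pick "$a" >/dev/null
a_copy=$(git rev-parse HEAD)

# Patch IDs depend on the diff, not on the commit's hash ID or parent.
orig_id=$(git show "$a" | git patch-id | cut -d' ' -f1)
copy_id=$(git show "$a_copy" | git patch-id | cut -d' ' -f1)
echo "original: $a  patch-id $orig_id"
echo "copy:     $a_copy  patch-id $copy_id"
```

In this conflict-free case the two patch IDs match while the commit hash IDs differ, which is the "same change, new commit" idea in miniature.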
This works fine for ordinary, single-parent commits. It does not work at all for merge commits.
Suppose we have a set of commits with a merge bubble, and this set of commits itself could be rebased:
           I--J
          /    \
         H      M   <-- feature (HEAD)
        / \    /
       /   K--L
      /
...--G-------N--O--P   <-- mainline
We might like to git rebase the feature commits atop commit P now. If we do, the default result is either:
...--G-------N--O--P   <-- mainline
                    \
                     H'-I'-J'-K'-L'   <-- feature (HEAD)
or:
...--G-------N--O--P   <-- mainline
                    \
                     H'-K'-L'-I'-J'   <-- feature (HEAD)
(I did not bother drawing in the abandoned commits, to save space.)
It is up to git rev-list to pick an order for I-J and K-L during the list-commits-to-copy part of the rebasing process. Commit M, the merge, is simply dropped: the two branches that resulted in merge commit M are flattened into one simple linear chain. This avoids the need to copy commit M, at the expense of sometimes not being able to copy the commits very well (having a lot of merge conflicts) and of course destroying our nice little merge bubble, if we wanted to keep it.
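The flattening is easy to reproduce. This sketch builds the graph drawn above in a throwaway repository and runs a plain git rebase; the commit helper and the extra branch name side (used only to build the K-L spur) are my own placeholders.

```shell
set -eu
cd "$(mktemp -d)"                          # throwaway repository (assumption)
git init -q
git symbolic-ref HEAD refs/heads/mainline
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: each commit adds one file named after it, so nothing conflicts.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit G
git checkout -q -b feature
commit H; commit I; commit J
git checkout -q -b side feature~2          # fork the K-L spur from H
commit K; commit L
git checkout -q feature
git merge -q --no-ff -m M side             # merge commit M, parents J and L
git checkout -q mainline
commit N; commit O; commit P

git checkout -q feature
git rebase -q mainline                     # default rebase, no --rebase-merges

# A straight line of five copies; M is gone.
git --no-pager log --graph --format='%s' mainline..feature
```

Which of the two orders you get (I-J first or K-L first) is Git's choice, so the sketch does not assume either one.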
While you can run git cherry-pick on a merge commit, the resulting commit is an ordinary, non-merge commit. Furthermore, you must tell Git which parent to use. Cherry-picking fundamentally has to diff the commit's parent vs the commit, but a merge has two parents, and Git simply does not know which of the two to use. You must tell it which one ... and then it copies the changes found by the diff, which is not what git merge is all about.
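Here is a sketch of that behavior, again in a throwaway repository with placeholder names. It tries to cherry-pick the merge M without -m, which Git refuses, and then with -m 1, which diffs M against its first parent and produces an ordinary one-parent commit.

```shell
set -eu
cd "$(mktemp -d)"                          # throwaway repository (assumption)
git init -q
git symbolic-ref HEAD refs/heads/mainline
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: each commit adds one file named after it.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit G
git checkout -q -b feature
commit H; commit I; commit J
git checkout -q -b side feature~2          # K-L spur forks from H
commit K; commit L
git checkout -q feature
git merge -q --no-ff -m M side
git checkout -q mainline
commit P
m=$(git rev-parse feature)

git checkout -q --detach mainline
if git cherry-pick "$m" 2>/dev/null; then
  refused=no
else
  refused=yes                              # Git cannot guess which parent to diff against
fi
echo "plain cherry-pick of a merge refused: $refused"

# -m 1: treat M's first parent (J) as the base, so the diff is what K and L added.
git cherry-pick -m 1 "$m" >/dev/null

# The copy's hash ID followed by its single parent.
git rev-list --parents -n 1 HEAD
```

The new commit brings in K.txt and L.txt but has only P as its parent, so the record that a merge happened is not carried over.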
git rebase re-performs the merges
What this all means for git rebase is that in order to "preserve" a merge, Git has to run git merge itself.
That is, suppose we are given:
           I--J
          /    \
         H      M   <-- feature (HEAD)
        / \    /
       /   K--L
      /
...--G-------N--O--P   <-- mainline
and we want to achieve:
                          I'-J'
                         /     \
                       H'       M'   <-- feature (HEAD)
                      /  \     /
                     /    K'-L'
                    /
...--G-------N--O--P   <-- mainline
Git's rebase can do this, but to do it, it must:
1. copy H to H' and drop a marker here;
2. pick either I or K to copy to I' or K', then copy either J or L next; let's say we pick I-J to do;
3. drop a marker at J';
4. git checkout the H' copy it made earlier, using the marker;
5. copy K and L now, to K' and L', and drop a marker here
so that as our intermediate result-so-far, we have:
                          I'-J'   <-- marker2
                         /
                       H'   <-- marker1
                      /  \
                     /    K'-L'   <-- marker3
                    /
...--G-------N--O--P   <-- mainline
Git can now git checkout commit J' using marker 2, run git merge on commit L' using marker 3, and thereby produce commit M', a new merge that used H' as its merge base and J' and L' as its two branch-tip commits.
Once the merge is done, the rebase-as-a-whole is done, and Git can delete the markers and yank the branch name feature over as usual.
If we're a little clever, we can let HEAD act as one of the three markers sometimes, but it's more straightforward to just drop markers each time. I'm not sure offhand which technique git rebase --rebase-merges actually uses.
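The marker procedure above can be written out directly as a rebase todo list, using the label, reset, and merge commands from the question. This is a hand-written sketch: the label names marker1, marker2, and marker3 come from the drawing, while the helper, the branch name side, and the trick of using cp as the sequence editor to substitute my own todo list are my assumptions, not what Git generates by itself.

```shell
set -eu
cd "$(mktemp -d)"                          # throwaway repository (assumption)
git init -q
git symbolic-ref HEAD refs/heads/mainline
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: each commit adds one file named after it.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit G
git checkout -q -b feature
commit H; commit I; commit J
git checkout -q -b side feature~2          # K-L spur forks from H
commit K; commit L
git checkout -q feature
git merge -q --no-ff -m M side
git checkout -q mainline
commit N; commit O; commit P

h=$(git rev-parse feature~3); i=$(git rev-parse feature~2); j=$(git rev-parse feature~1)
k=$(git rev-parse side~1);    l=$(git rev-parse side);      m=$(git rev-parse feature)

# Hand-written todo list that follows the five steps, then re-performs the merge.
todo=$(mktemp)
cat > "$todo" <<EOF
pick $h
label marker1
pick $i
pick $j
label marker2
reset marker1
pick $k
pick $l
label marker3
reset marker2
merge -C $m marker3
EOF

git checkout -q feature
# The "editor" simply overwrites Git's proposed todo list with ours.
GIT_SEQUENCE_EDITOR="cp '$todo'" git rebase -q -i --rebase-merges mainline

git --no-pager log --graph --format='%s' mainline..feature
```

The final reset marker2 puts HEAD on J' so that J' becomes the first parent of the new merge, and merge -C reuses the original M's commit message.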
The label, reset, and merge commands create and use the various markers. The merge command requires that HEAD point to the commit that will be the first parent of the resulting merge (since git merge works that way). It's interesting that the syntax suggests that octopus merges are forbidden here: they should Just Work and hence should be allowed.
(The -C in the merge command can use the raw hash ID of the original merge commit, since that's always unchanged. The labels you'll see, if you use --rebase-merges with a set of commits that contains merges, are generated by Git from the commit messages, and until quite recently there was a bug here.)
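To look at the labels Git itself generates, one option is to let cat stand in for the sequence editor: it prints the todo list and leaves it unchanged, so the rebase then proceeds as proposed. The sketch below assumes the same throwaway history as the drawings, with placeholder names; I deliberately do not show the expected todo text, since the exact labels and ordering are Git's choice.

```shell
set -eu
cd "$(mktemp -d)"                          # throwaway repository (assumption)
git init -q
git symbolic-ref HEAD refs/heads/mainline
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: each commit adds one file named after it.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit G
git checkout -q -b feature
commit H; commit I; commit J
git checkout -q -b side feature~2          # K-L spur forks from H
commit K; commit L
git checkout -q feature
git merge -q --no-ff -m M side
git checkout -q mainline
commit N; commit O; commit P

git checkout -q feature
# cat prints the generated todo list to stdout and exits successfully.
todo=$(GIT_SEQUENCE_EDITOR=cat git rebase -i --rebase-merges mainline 2>/dev/null)

# Show only the command lines, dropping blank lines and the commented help text.
printf '%s\n' "$todo" | grep -E '^(label|reset|pick|merge) '

# The merge bubble survives this time.
git --no-pager log --graph --format='%s' mainline..feature
```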
--ours merges don't survive
When Git re-performs a merge, it just uses the regular merge engine. Git does not know about any flags used during the merge, or any changes introduced as an "evil merge". So -X ours, or --ours, or extra changes just get lost during this kind of rebase. Of course, if the merge has merge conflicts, you get a chance to re-insert evil-merge changes, or redo the merge entirely however you like.
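As an illustration, the sketch below makes the original merge with the ours strategy (git merge -s ours, which is my reading of "--ours merges" here), so that M deliberately ignores everything from the side branch. After git rebase --rebase-merges, the re-performed merge is an ordinary one, and the side branch's file appears. The repository, helper, and names are placeholders.

```shell
set -eu
cd "$(mktemp -d)"                          # throwaway repository (assumption)
git init -q
git symbolic-ref HEAD refs/heads/mainline
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: each commit adds one file named after it.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit G
git checkout -q -b feature
commit H; commit J
git checkout -q -b side feature~1          # side forks from H
commit L
git checkout -q feature
git merge -q -s ours -m M side             # M records L as a parent but keeps J's snapshot
git checkout -q mainline
commit P

git checkout -q feature
before=$(git ls-tree --name-only feature)  # no L.txt: the ours strategy discarded it
git rebase -q --rebase-merges mainline
after=$(git ls-tree --name-only feature)   # L.txt is present: the merge was redone normally

echo "files in M before the rebase:"; printf '%s\n' "$before"
echo "files in M' after the rebase:";  printf '%s\n' "$after"
```

If the discarded changes were meant to stay discarded, the merge has to be redone by hand, for example by stopping at that point in an interactive rebase.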
See also Evil merges in git?