
Git rebase when previous commit changed

Tags:

git

I frequently find myself working on two different work tickets that are on different git branches, but one is dependent on another, like this:

* later-branch
|
* earlier-branch
|
* some prior commit
|

(Each is a single commit here because we are using gerrit, but this question might apply to multiple commits for each as well.) The earlier branch may be going through review, and so I might have to go back and modify it at some point with a git commit --amend. As must happen, this will fork the history:

* earlier-branch
|
|  * later-branch
|  |
|  * previous version of earlier-branch
| /
* some prior commit
|

At this point I want to rebase the later-branch on top of the new version of earlier-branch. But if I just do a git checkout later-branch followed by a git rebase earlier-branch, it always gets conflicts, because (I think) it must first apply the previous version of earlier-branch commit to the most recent version of earlier-branch.

What I end up doing is git checkout -b new-later-branch-name earlier-branch followed by git cherry-pick later-branch and git branch -D later-branch, which is a pain. Can anyone suggest a better way to handle this?

Computronium asked Jul 06 '17 09:07


2 Answers

I see two easy ways to do this.

The first, and best, option is to use git rebase in interactive mode. To do this, run:

git checkout later-branch
git rebase -i earlier-branch

In the screen that pops up, you would choose to drop the previous version of earlier-branch:

drop efb1c19 previous version of earlier-branch
pick a25ba16 later-branch

# Rebase 65f3afc..a25ba16 onto 65f3afc (2 commands)
#
# Commands:
# p, pick = use commit
# r, reword = use commit, but edit the commit message
# e, edit = use commit, but stop for amending
# s, squash = use commit, but meld into previous commit
# f, fixup = like "squash", but discard this commit's log message
# x, exec = run command (the rest of the line) using shell
# d, drop = remove commit
...

This will rebase later-branch on top of earlier-branch, providing the following tree:

* later-branch
|
* earlier-branch
|
|  * previous version of earlier-branch
| /
* some prior commit
|
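If you would rather not edit the todo list by hand, the same drop can be scripted. What follows is only a minimal sketch, not something from the original answer: it builds a throwaway repository (the save helper, file names and commit messages are made up for illustration) and points the GIT_SEQUENCE_EDITOR environment variable at sed to turn the first pick into drop. It assumes the stale commit is the first line of the todo list and that your sed accepts -i.bak.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"
# Hypothetical helper: write a file and commit it.
save() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$3"; }

save base.txt base "some prior commit"
git checkout -q -b earlier-branch
save earlier.txt v1 "earlier work"
git checkout -q -b later-branch
save later.txt later "later work"

# Amend the earlier commit, which forks the history.
git checkout -q earlier-branch
echo v2 > earlier.txt
git commit -q -a --amend -m "earlier work, amended"

# Rebase interactively, letting sed rewrite the first todo line to "drop".
git checkout -q later-branch
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1s/^pick/drop/'" git rebase -q -i earlier-branch
git log --oneline --graph later-branch
```

If the stale commit is not the first entry, the sed expression has to be adjusted, so checking the todo list once by hand is probably wise before scripting it.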

Another option is to simply do a git cherry-pick. If you do:

git checkout earlier-branch
git cherry-pick later-branch

You'll get the following tree:

* earlier-branch -> cherry-picked commit 1
|
* earlier-branch -> amended commit 0, now commit 2
|
|  * later-branch -> commit 1
|  |
|  * previous version of earlier-branch -> commit 0
| /
* some prior commit
|

So, in effect, this will produce the result you want, but it will advance earlier-branch by one. If the branch names are important to you, you could rename and reset them accordingly.
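One possible way to "rename and reset" afterwards is sketched here. This is an assumption about what the tidy-up could look like, not the answerer's exact procedure: after the cherry-pick, later-branch is forced onto the new copy and earlier-branch is moved back to the amended commit. The repository, the save helper and the commit messages are invented for the example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"
# Hypothetical helper: write a file and commit it.
save() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$3"; }

save base.txt base "some prior commit"
git checkout -q -b earlier-branch
save earlier.txt v1 "earlier work"
git checkout -q -b later-branch
save later.txt later "later work"
git checkout -q earlier-branch
echo v2 > earlier.txt
git commit -q -a --amend -m "earlier work, amended"

amended=$(git rev-parse HEAD)            # remember the amended commit
git cherry-pick later-branch >/dev/null  # copy the later commit on top of it
git branch -f later-branch HEAD          # point later-branch at the new copy
git reset -q --hard "$amended"           # move earlier-branch back by one
git log --oneline --graph later-branch
```

Note that git reset --hard discards uncommitted changes, so this is only safe with a clean working tree.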

houtanb answered Oct 16 '22 22:10


Aside from interactive rebase (as in houtanb's answer), there are two more ways to do this somewhat, or much, more automatically:

  • using git rebase --onto, or
  • using the "fork-point" code (in Git since Git version 2.0).

To use the latter, you can run git rebase --fork-point earlier-branch when on later-branch.

(You can instead set earlier-branch as the upstream for later-branch—presumably just temporarily, for the duration of the rebasing—and then just run git rebase when on later-branch. The reason is that --fork-point is the default when using the automatic upstream mode, but must be explicitly requested when using an explicit <upstream> argument to git rebase.)
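As a rough illustration of the explicit form, the sketch below recreates the situation in a scratch repository and then runs git rebase --fork-point earlier-branch. The commit names A1, A2, B and C, the save helper and the file names are assumptions made for the example, and it relies on the earlier-branch reflog still holding the pre-amend commit.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"
# Hypothetical helper: write a file and commit it.
save() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$3"; }

save base.txt base "some prior commit"
git checkout -q -b earlier-branch
save a.txt one "A1"
git checkout -q -b later-branch
save b.txt b "B"
save c.txt c "C"

# Amend A1 into A2; the old commit stays in earlier-branch's reflog.
git checkout -q earlier-branch
echo two > a.txt
git commit -q -a --amend -m "A2"

# Let Git work out from the reflog that A1 should not be copied.
git checkout -q later-branch
git rebase -q --fork-point earlier-branch
git log --oneline --graph later-branch
```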

Unfortunately, the fork-point mode can seem like magic, especially to those new to Git. Fortunately, your diagram contains the seeds for understanding it and, with it, git rebase --onto.

Defining fork-point

Let's take what you drew above and turn it sideways, then turn it a little bit more. This gives me some room to draw in the branch names. I'll replace the *s for each commit with round o nodes or uppercase letters and numbers. I'll also add a third commit, C, to the later branch, just for illustration.

:
 .
  \
   o
    \
     A1   <-- earlier-branch
      \
       B--C   <-- later-branch

Now you are forced, for whatever reason, to copy commit A1 to new commit A2, and move the branch label earlier-branch to point to the new copy:

:
 .
  \
   o--A2  <-- earlier-branch
    \
     A1
      \
       B--C   <-- later-branch

If only Git would remember that commit A1 is on later-branch only because earlier-branch used to contain it, we could tell Git: "when copying later-branch, drop any commits that are on it now only because they used to be on earlier-branch".

But Git does remember this, for 30 days at least, by default. Git has reflogs—logs of what used to be stored in each reference (including both regular branches and Git's so-called remote tracking branches). If we add the reflog information to the drawing, it looks like this:

:
 .
  \
   o--A2   <-- earlier-branch
    \
     A1    <-- [earlier-branch@{1}]
      \
       B--C   <-- later-branch

In fact, if for some reason you must copy A2 to A3, the diagram just grows another reflog entry, renumbering the existing one:

:
 .   A3    <-- earlier-branch
  \ /
   o--A2   <-- [earlier-branch@{1}]
    \
     A1    <-- [earlier-branch@{2}]
      \
       B--C   <-- later-branch

What the fork-point code does is scan the reflog for some other reference, such as earlier-branch, and find these commits. In this case it finds A1; in the A3 diagram it actually finds both A1 and A2, but then winnows the result down to the A1 that's on both branches (see also "Git rebase - commit select in fork-point mode"). It then runs git rebase --onto for you, as if you had manually run:

git rebase --onto earlier-branch <hash-of-A1>

which gets us into how the --onto argument works.
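To see what the fork-point code would find, you can inspect the reflog and ask git merge-base directly. The following is a small sketch in a scratch repository; the commit names, the save helper and the a1 variable are assumptions for the example. git merge-base --fork-point prints the commit that rebase would treat as the fork point.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"
# Hypothetical helper: write a file and commit it.
save() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$3"; }

save base.txt base "some prior commit"
git checkout -q -b earlier-branch
save a.txt one "A1"
a1=$(git rev-parse HEAD)                 # hash of A1, kept for comparison
git checkout -q -b later-branch
save b.txt b "B"
save c.txt c "C"
git checkout -q earlier-branch
echo two > a.txt
git commit -q -a --amend -m "A2"

# The reflog still lists A1 as a former value of earlier-branch.
git reflog show earlier-branch
previous=$(git rev-parse 'earlier-branch@{1}')

# Ask Git which commit it considers the fork point of later-branch.
fork_point=$(git merge-base --fork-point earlier-branch later-branch)
echo "A1:         $a1"
echo "fork point: $fork_point"
```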

Regular rebase, without --onto

Normally, you would run git rebase with one argument, as in git rebase branch-name, or even no arguments at all. With no arguments at all, git rebase uses the current branch's upstream setting. With a branch-name argument, git rebase calls that argument <upstream>. (As an odd side effect, this also—since Git version 2.0 anyway—automatically enables or disables the --fork-point option, requiring you to use an explicit --no-fork-point or --fork-point if you want the other mode.)

In any case, Git uses the <upstream>—selected automatically if you did not specify one—for two purposes. One is to limit the set of commits that will be copied: Git will consider copying the set of commits listed by running:

git rev-list <upstream>..HEAD

To see them in a more friendly fashion, use git log, or my preferred method, git log --oneline --decorate --graph, instead of git rev-list here:

git log --oneline --decorate --graph earlier-branch..HEAD

Ideally, we would see commits B and C here, with C listed first (Git has to use --reverse to make sure it copies B first). If you copied A1 to A2 and/or on to A3, though, and moved the branch earlier-branch, we will see all of A1, B, and C. (Git excludes A2 or A3—whichever earlier-branch points to—but they're not on the list anyway. It then uses the excluded A2 or A3 to exclude the commits before A1, so that's why we don't see those.)
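The effect of the amend on this list can be checked in a scratch repository. This is only a sketch with invented commit names and a made-up save helper: after A1 has been amended to A2, the range earlier-branch..later-branch includes A1 as well as B and C.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"
# Hypothetical helper: write a file and commit it.
save() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$3"; }

save base.txt base "some prior commit"
git checkout -q -b earlier-branch
save a.txt one "A1"
git checkout -q -b later-branch
save b.txt b "B"
save c.txt c "C"
git checkout -q earlier-branch
echo two > a.txt
git commit -q -a --amend -m "A2"

# Subjects of the commits a plain "git rebase earlier-branch" would consider.
listed=$(git log --format=%s earlier-branch..later-branch)
echo "$listed"
```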

The other purpose for this <upstream> branch name (or commit hash) is to select where the copies go. When we copy one or more commits, each copied commit has to go after some existing commit. The <upstream> argument provides the ID of the commit that will be the parent of the first commit we copy.

Hence, running git rebase earlier-branch makes Git list commits A1, B, and C, in that order. It then—using the "detached HEAD" mode—copies A1 to go after earlier-branch:

:
 .       A1'  <-- HEAD
  \     /
   o--A2   <-- earlier-branch
    \
     A1    <-- [earlier-branch@{1}]
      \
       B--C   <-- later-branch

and then copies B to go after A1':

:
 .       A1'--B'  <-- HEAD
  \     /
   o--A2   <-- earlier-branch
    \
     A1    <-- [earlier-branch@{1}]
      \
       B--C   <-- later-branch

Rebase then copies C to C' and moves the branch label, later-branch, to wherever HEAD winds up, re-attaching your HEAD in the process:

:
 .       A1'--B'--C'  <-- later-branch (HEAD)
  \     /
   o--A2   <-- earlier-branch
    \
     A1    <-- [earlier-branch@{1}]
      \
       B--C   <-- [later-branch@{1}]

The --onto argument lets you tell Git where the copies go.

Using --onto with git rebase

When you add --onto, you tell Git rebase where to put the copies. This frees up the <upstream> argument so that it now specifies only what not to copy! So now you are free to tell Git: "copy everything that's after commit A1" by writing:

git rebase --onto earlier-branch <hash-of-A1>

Git does its usual thing, listing the commits to be copied (B and C), detaching your HEAD from later-branch, copying the commits one at a time with the copies going after the tip of earlier-branch, and finally moving the name later-branch to reattach your HEAD.

This is exactly what we wanted, all done semi-automatically: we tell Git not to copy A1 itself, so that it copies just B and C.
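Put together in a scratch repository, the manual --onto variant might look like the sketch below. The commit names and the save helper are invented for illustration, and instead of pasting the hash of A1 it uses the reflog name earlier-branch@{1}, which assumes the amend was the most recent update to earlier-branch.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"
# Hypothetical helper: write a file and commit it.
save() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$3"; }

save base.txt base "some prior commit"
git checkout -q -b earlier-branch
save a.txt one "A1"
git checkout -q -b later-branch
save b.txt b "B"
save c.txt c "C"
git checkout -q earlier-branch
echo two > a.txt
git commit -q -a --amend -m "A2"

# Copy everything after A1 (that is, B and C) onto the tip of earlier-branch.
git rebase -q --onto earlier-branch 'earlier-branch@{1}' later-branch
git log --oneline --graph later-branch
```

Naming later-branch as the last argument makes Git check it out first, so the command works no matter which branch you start on.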

When we specify an upstream, as in git rebase earlier-branch, Git disables the fork-point mode. If we explicitly enable it, Git will fish through the earlier-branch reflog. As long as the reflog entry for commit A1 has not yet expired, Git will discover that A1 used to be on earlier-branch and will use --onto for us to discard it from the to-copy list.

Note that there is a bit of danger here. What if we really wanted A1 after all, e.g., what if we backed earlier-branch up over A1 only because we realized A1 did not belong on the other branch? Git will still think we copied it to some other commit and don't want it copied now, and will toss it off the list. Fortunately, you can always undo a rebase: rebase doesn't discard anything at all, it just copies. It then updates a branch, which saves the previous value in the branch's reflog. But fishing around through reflogs, trying to find one particular set of commits in a twisty maze of commits that are all alike, is not a lot of fun, so it is wise to think a bit before running rebase, with or without --fork-point.
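For a rebase that has just finished, the undo is usually short. Here is a sketch under the assumption that nothing else has moved later-branch since the rebase, so that later-branch@{1} still names the pre-rebase tip; the repository, the save helper and the commit names are invented for the example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"
# Hypothetical helper: write a file and commit it.
save() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$3"; }

save base.txt base "some prior commit"
git checkout -q -b earlier-branch
save a.txt one "A1"
git checkout -q -b later-branch
save b.txt b "B"
save c.txt c "C"
original_tip=$(git rev-parse HEAD)       # the original C, kept for comparison
git checkout -q earlier-branch
echo two > a.txt
git commit -q -a --amend -m "A2"

# Rebase, dropping A1 from the copies.
git rebase -q --onto earlier-branch 'earlier-branch@{1}' later-branch

# Change of heart: put later-branch back where it was before the rebase.
git reset -q --hard 'later-branch@{1}'
git log --oneline --graph later-branch
```

As with any git reset --hard, uncommitted changes in the working tree are lost, so check git status first.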

Side note

In a few (rare) cases you don't have to do anything (no fork-point mode, no manual --onto separation, no --interactive). Specifically, if the patch itself did not change at all, but only the wording in the commit message changed, Git will detect the already-copied commit and skip it. This happens because git rebase actually uses the symmetric difference mode of git rev-list with the --cherry-pick --right-only --no-merges options. That is, rather than:

git rev-list <upstream>..HEAD

Git actually runs:

git rev-list --cherry-pick --right-only --no-merges <upstream>...HEAD

(note the three dots). I don't have time to go into any more detail here, though.
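The difference between the two listings can be observed in a scratch repository where the earlier commit is amended with a new message only. This is a sketch with invented names (including the save helper); it counts the commits each form selects and then runs a plain rebase to show the reworded commit being skipped.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"
# Hypothetical helper: write a file and commit it.
save() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$3"; }

save base.txt base "some prior commit"
git checkout -q -b earlier-branch
save a.txt one "A1"
git checkout -q -b later-branch
save b.txt b "B"
save c.txt c "C"

# Amend only the message: the patch itself is unchanged.
git checkout -q earlier-branch
git commit -q --amend -m "A1 reworded"

# Two-dot form: the old A1 is still listed along with B and C.
plain=$(git rev-list earlier-branch..later-branch | wc -l | tr -d ' ')
# Three-dot form with --cherry-pick: the old A1 is recognized as a duplicate.
filtered=$(git rev-list --cherry-pick --right-only --no-merges \
  earlier-branch...later-branch | wc -l | tr -d ' ')
echo "two-dot: $plain commits, cherry-pick filtered: $filtered commits"

# A plain rebase therefore copies only B and C.
git rebase -q earlier-branch later-branch
```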

torek answered Oct 16 '22 21:10