
Branching off of squashed branches

Suppose I have the following git history: a master branch starting with commit A, a feature-1 branch branched off of A with commits B and C, and a second feature branch feature-2 that built off of commit C with commits D and E.

master     A
            \
feature-1    B--C
                 \
feature-2         D--E

Now suppose that commit C has been tested and is ready to merge in, so we use git switch master; git merge feature-1 --squash.

master     A------C'
            \    /
feature-1    B--C
                 \
feature-2         D--E

The history for master is nice and clean with just commits A and C', but if we now want to compare master and feature-2 (e.g., git log master..feature-2) we end up seeing all of the commits from feature-1 that were already merged in.

Question 1: Is there an easy way to squash the history for feature-2 to match the squashed merge? What if the history is a little more complicated and there were more commits after the branch point C on feature-1 that were squash-merged into master?

Question 2: Assuming that rewriting history is hard (or can only be tediously done with a git rebase -i; I've got way more than two commits on each branch), is there any way to view only the commits in feature-2 that weren't squash-merged into master? When performing a pull request on GitHub or Bitbucket for feature-2 -> master, is there any way to only list those genuinely new commits?

asked Aug 02 '20 16:08 by clwainwright



2 Answers

Now suppose that commit C has been tested and is ready to merge in, so we use git switch master; git merge feature-1 --squash.

master     A------C'
            \    /
feature-1    B--C
                 \
feature-2         D--E

This drawing isn't quite right: it should read the way I've drawn below. Note, I've moved the names to the right as well, for reasons that should become clearer in a moment. I also called the squash commit BC, which is an attempt to make it clear that there is a single commit that does what B-and-C did together.

What you drew was a real merge (although you called the merge commit C'). As matt said, a "squash merge" isn't a merge at all.

A--BC   <-- master
 \
  B--C   <-- feature-1
      \
       D--E   <-- feature-2

At this point, there's almost no reason to keep the name feature-1. If you delete it, we can redraw the graph like this:

A--BC   <-- master
 \
  B--C--D--E   <-- feature-2

Note that commits A-B-C-D-E are all on branch feature-2 (regardless of whether we delete the name feature-1); commit BC is only on master.

The main reason to retain the name feature-1 is that it identifies commit C, which makes it easy to copy commits D and E (and no others) to new and improved commits D'-E'.

Question 1: Is there an easy way to squash the history for feature-2 to match the squashed merge?

It's not completely clear to me what you mean by "squash the history". Having run the above git merge --squash, though, the snapshot in commit BC will match (exactly) the snapshot in commit C, so running:

git switch feature-2 && git rebase --onto master feature-1

(note the --onto here; see footnote 1) will tell Git to copy commits D and E (only) with the copies going after commit BC, like this:

      D'-E'  <-- feature-2 (HEAD)
     /
A--BC   <-- master
 \
  B--C   <-- feature-1
      \
       D--E   [abandoned]

It's now safe to delete the name feature-1 as we no longer need something to remember the hash ID of commit C. If we stop drawing in the abandoned commits, we end up with:

A--BC   <-- master
     \
      D'-E'  <-- feature-2

which might be what you wanted.
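
To try this without touching a real repository, here is a minimal sketch that rebuilds the A, B-C, D-E history in a throwaway directory and runs the rebase. The temporary directory, the one-file-per-commit layout, and the `commit` helper are assumptions made for the illustration; they are not part of the question.

```shell
#!/bin/sh
set -eu

# Throwaway repository; every name and file below is made up for the demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical helper: create a file named after the commit, then commit it.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit A
git switch -q -c feature-1
commit B
commit C
git switch -q -c feature-2
commit D
commit E

# Squash feature-1 into master. --squash only stages the result,
# so a separate git commit is needed to create commit BC.
git switch -q master
git merge -q --squash feature-1
git commit -q -m "BC"

# Copy only the commits reachable from feature-2 but not from feature-1
# (D and E), placing the copies after master (BC).
git switch -q feature-2
git rebase -q --onto master feature-1

git --no-pager log --oneline --graph master feature-2
```

After the rebase, git log master..feature-2 should list only the copies of D and E.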


Footnote 1: Normally, git rebase takes one name or commit hash ID. It then:

  1. lists out some set of commits to copy, using a commit hash ID as a limiter;
  2. does the equivalent of git switch --detach on a commit hash ID;
  3. copies the commits listed in step 1;
  4. moves the branch name that you were on before step 2, to point to the last commit copied by step 3; and
  5. does the equivalent of git switch back to the branch name just moved in step 4.

When not using --onto, the commit hash IDs in steps 1 and 2 are the same. When using --onto, the commit hash IDs in steps 1 and 2 are, or at least can be, different. So with --onto we can tell Git: Only copy some commits, rather than many commits.

Specifically, without --onto, we'll copy all the commits that are reachable from HEAD, but not reachable from the (single) argument, and the copies will go to the (single) argument. With --onto, we can say: Copy commits reachable from HEAD but not from my specified limiter, to the place specified by my separate --onto argument. In this case that lets us say do not attempt to copy B and C.
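
The five steps can be approximated by hand with git rev-list, git cherry-pick, and git branch -f. This is only a sketch of the idea, not how git rebase is implemented: the real command also skips merges and patch-ID duplicates and keeps state so it can be aborted or continued. The throwaway repository and the `commit` helper are assumptions for the illustration.

```shell
#!/bin/sh
set -eu

# Throwaway repository with the A / B-C / D-E history and a squash commit BC.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical helper: one new file per commit.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit A
git switch -q -c feature-1
commit B
commit C
git switch -q -c feature-2
commit D
commit E
git switch -q master
git merge -q --squash feature-1
git commit -q -m "BC"
git switch -q feature-2

# Step 1: list the commits to copy, oldest first, with feature-1 as the limiter.
to_copy=$(git rev-list --reverse feature-1..feature-2)

# Step 2: detach HEAD at the --onto target (master).
git switch -q --detach master

# Step 3: copy each listed commit in order.
for c in $to_copy; do
    git cherry-pick "$c" >/dev/null
done

# Step 4: move the original branch name to the last copy.
git branch -f feature-2 HEAD

# Step 5: get back on the branch.
git switch -q feature-2

git --no-pager log --oneline --graph master feature-2
```

Here the limiter in step 1 (feature-1) and the target in step 2 (master) are different commits, which is exactly what --onto allows.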


On the other hand, you can also simply run:

git switch master             # if needed - you're probably already there
git merge --squash feature-2
git commit                    # --squash stages the result but does not commit it

if you just wanted a single squash-merge of the D-E chain:

A--BC--DE   <-- master (HEAD)
 \
  B--C   <-- feature-1
      \
       D--E   <-- feature-2

This git merge --squash will usually go smoothly as well, because, like regular git merge, git merge --squash starts by:

  • finding the merge base (A in this case);
  • diffing the merge base against the current commit (BC, because HEAD is master which identifies commit BC); and
  • diffing the merge base against the specified commit (E, because feature-2 names commit E).

The first diff shows what B+C did because BC's snapshot matches C's, and the second shows what B+C+D+E did, because E's snapshot is the result of B plus C plus D plus E. So unless D and/or E specifically un-does something B and/or C did, the two sets of changes are likely to merge automatically.

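(Note that the rebase always goes smoothly here, even if D and/or E undo something: BC's snapshot is identical to C's, so D and E are applied to exactly the content they were written against.)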
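
The merge base and the two diffs described above can be inspected directly before squashing. The following is a sketch in a throwaway repository; the one-file-per-commit layout and the `commit` helper are assumptions for the illustration, and file names stand in for "what each commit did".

```shell
#!/bin/sh
set -eu

# Throwaway repository with the A / B-C / D-E history and a squash commit BC.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical helper: one new file per commit.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit A
git switch -q -c feature-1
commit B
commit C
git switch -q -c feature-2
commit D
commit E
git switch -q master
git merge -q --squash feature-1
git commit -q -m "BC"

# The merge base of master and feature-2 is still commit A,
# because BC has A as its only parent.
base=$(git merge-base master feature-2)
git --no-pager log -1 --format='merge base: %s' "$base"

echo "changed on master since the base (what B+C did):"
git --no-pager diff --name-only "$base" master

echo "changed on feature-2 since the base (what B+C+D+E did):"
git --no-pager diff --name-only "$base" feature-2

# Both sides made the same B and C changes, so they combine cleanly.
git merge -q --squash feature-2
git commit -q -m "DE"

git --no-pager log --oneline --graph master
```

The resulting DE commit has a single parent (BC), and its snapshot matches commit E.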

The difference between a squash-not-really-a-merge and a real merge is limited to the final commit: the squash has a commit with a single parent, in this case BC, while a real merge has a commit with two parents. In this case a real merge would give you BC as one parent, and E as the other. You probably want to rebase away the B and C commits first if you like having the BC squash-merge.

What if the history is a little more complicated and there were more commits after the branch point C on feature-1 that were squash-merged into master?

As always, the trick is to draw an actual graph. We might start with this:

A   <-- master
 \
  B--C--F--G   <-- feature-1
      \
       D--E   <-- feature-2

which, after git switch master && git merge --squash feature-1 (followed by the git commit that --squash leaves for you to run), produces:

A--BCFG   <-- master
 \
  B--C--F--G   <-- feature-1
      \
       D--E   <-- feature-2

It's now appropriate to use:

git switch feature-2 && git rebase --onto master feature-1

Note that this is the same command that we used in the earlier situation. It says (compare with the steps in footnote 1 above):

  1. List out commits reachable from feature-2 (where we are after the git switch) but not from feature-1. The commits reachable from feature-2 are A-B-C-D-E, and the commits reachable from feature-1 are A-B-C-F-G. Subtracting A-B-C-F-G from A-B-C-D-E leaves D-E.

  2. Get onto a detached HEAD at master, i.e., commit BCFG.

  3. Copy the commits listed in step 1, i.e., D and E.

  4. Yank the branch name (feature-2) around to where we are now (commit E').

  5. Do the equivalent of git switch feature-2 again.

The result is:

        D'-E'  <-- feature-2 (HEAD)
       /
A--BCFG   <-- master
 \
  B--C--F--G   <-- feature-1
      \
       D--E   [abandoned]

after which it's safe to delete the name feature-1: we no longer need an easy way to find commit C via commit G any more.

Question 2: Assuming that rewriting history is hard (or can only be tediously done with a git rebase -i; I've got way more than two commits on each branch) ...

As you can see above, this isn't necessarily a correct assumption. How hard the rebase is depends on how many merge conflicts you get with each to-be-copied commit, which depends on what happened after the last common commit (C in the drawings above). Still:

... is there any way to view only the commits in feature-2 that weren't squash-merged into master?

The git log command has a simple syntax for this, as long as you still have the name feature-1 identifying the appropriate commit, as in the various drawings above:

git log feature-1..feature-2

does just that. This syntax means all commits reachable by starting at feature-2 and working backwards, minus all commits reachable by starting at feature-1 and working backwards. Note that this is the same set of commits that we copied with our git rebase operations in the examples above (see footnote 2).
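
As a quick comparison, this sketch rebuilds the B-C-F-G / D-E history from the drawing above and prints both ranges side by side. The throwaway repository and the `commit` helper are assumptions for the illustration.

```shell
#!/bin/sh
set -eu

# Throwaway repository reproducing the "more complicated" drawing.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical helper: one new file per commit.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit A
git switch -q -c feature-1
commit B
commit C
git switch -q -c feature-2
commit D
commit E
git switch -q feature-1
commit F
commit G

# Squash everything on feature-1 into a single commit on master.
git switch -q master
git merge -q --squash feature-1
git commit -q -m "BCFG"

echo "master..feature-2 (still includes the squashed B and C):"
git --no-pager log --format='  %s' master..feature-2

echo "feature-1..feature-2 (only the genuinely new commits):"
git --no-pager log --format='  %s' feature-1..feature-2
```

The first range lists four commits and the second lists two, because B and C are reachable from feature-1 but not from master.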

When performing a pull request on GitHub or Bitbucket for feature-2 -> master, is there any way to only list those genuinely new commits?

No, because these systems do not have the equivalent syntax. However, once you use rebase to copy just the desired commits, and force-push to make the GitHub or Bitbucket repository match, they'll show what you wanted.


Footnote 2: Not mentioned above is the fact that git rebase deliberately omits certain commits in step 1 by default. In your case, there are no commits that should be omitted here, so this is not really relevant, but it is worth mentioning:

  • By default, git rebase omits all merge commits.
  • By default, git rebase also uses the same computations that git cherry or git log --cherry-pick would use to eliminate from the copying any commits whose patch ID matches a commit in the upstream set of commits; recent Git versions accept --reapply-cherry-picks to turn this off, while older versions have no such option. (This set is hard to define without getting into the details of how the A...B symmetric difference notation works.) In your case that doesn't matter either, because this kind of patch-ID matching is extremely unlikely here. It's meant more for the case where someone upstream deliberately used git cherry-pick to copy one or more of your commits to the branch on which you are going to rebase.
  • In some cases (but once again not yours) git rebase defaults to running git merge-base --fork-point to find commits to omit, and this can produce surprising results.

The rebase documentation has historically been lax about mentioning these, probably because they don't come up all that often. In your case, they should not come up. The latest rebase documentation is greatly improved.
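
The patch-ID case is easy to see with git cherry. The sketch below uses a different, made-up history in which upstream cherry-picks one commit from a topic branch; the branch name `topic`, the commit names X, Y, and M, and the `commit` helper are all assumptions for the illustration.

```shell
#!/bin/sh
set -eu

# Throwaway repository: master cherry-picks one commit from topic.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical helper: one new file per commit.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit A
git switch -q -c topic
commit X
commit Y

# Upstream moves on, then copies X (the parent of topic's tip).
git switch -q master
commit M
git cherry-pick topic~1 >/dev/null

# "-" marks a commit whose patch ID already exists upstream,
# "+" marks one that does not.
git cherry -v master topic

# A plain rebase therefore copies only Y; the copy of X is skipped.
git switch -q topic
git rebase -q master

git --no-pager log --format='%s' master..topic
```

A squash commit such as BC has a different patch ID from B or C individually, which is why this mechanism does not help in the squash-merge situation.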

answered Oct 13 '22 22:10 by torek



In the simple scenario, if you simply rebase feature-2 onto master, the commits that were already squashed into master should come out empty and be dropped, so you'll be left with only the non-empty commits.

In the more complex scenario, I would propose the following:

git switch feature-2
git merge origin/master
git reset --soft origin/master
git commit -m 'feature 2'

This will result in a single commit that holds all changes from feature-2, effectively squashed on top of master.
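
Here is a sketch of the same sequence in a throwaway repository. It assumes a purely local setup, so it uses master where the commands above use origin/master, and it passes --no-edit so the merge does not open an editor; the `commit` helper is hypothetical.

```shell
#!/bin/sh
set -eu

# Throwaway repository with the A / B-C / D-E history and a squash commit BC.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical helper: one new file per commit.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit A
git switch -q -c feature-1
commit B
commit C
git switch -q -c feature-2
commit D
commit E
git switch -q master
git merge -q --squash feature-1
git commit -q -m "BC"

git switch -q feature-2
# Bring master's content into feature-2 so any conflicts are resolved once.
git merge -q --no-edit master
# Move the branch to master but keep the merged files staged.
git reset -q --soft master
# Record everything that differs from master as one commit.
git commit -q -m "feature 2"

git --no-pager log --oneline --graph master feature-2
git --no-pager diff --name-only master feature-2
```

Note that this rewrites feature-2 into a single commit, so the individual D and E commits are no longer on the branch.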

answered Oct 14 '22 00:10 by Nitsan Avni
