How to join last N merge commits into one?

Tags:

git

In my repository I currently have this history:

$ git log --oneline
<commit-id-1> Merge commit '<merged-commit-id-1>' into <branch>
<commit-id-2> Merge commit '<merged-commit-id-2>' into <branch>
...

where <merged-commit-id-1> and <merged-commit-id-2> were merged from another branch, the one from which I previously created the current branch. Now I want to join those merge commits into one somehow:

$ git log --oneline
<commit-id> My message about huge successful merge here...
...

I tried

$ git rebase --preserve-merges -i HEAD~2

with p and s but get this error:

Refusing to squash a merge: ...

(I also reproduced it in a minimal new repository.) Is there a way around it?

What I really want is to be able to merge a lot of commits by their IDs from the original branch (which was changed) while resolving conflicts gradually and not introducing dozens of merge commits (merge --squash is not an option either, since I need to preserve the history).

Example:

$ git init
$ echo 1 > 1; git add 1; git commit -m 1
$ git branch branch
$ echo 2 > 2; git add 2; git commit -m 2
$ echo 3 > 3; git add 3; git commit -m 3
$ git checkout branch
$ echo 4 > 4; git add 4; git commit -m 4
$ git log master --oneline
8222... 3
1f03... 2
... 1
$ git merge 1f03
# in real project here goes some work
$ git merge 8222
# and here, can't merge 8222 right away, because it can be difficult
# now need to clean up merge commits OR merge incrementally in a different way
$ git rebase -i --preserve-merges HEAD~2 
p ...
s ...
Refusing to squash a merge: a199... (merge commit for 8222)

Solution by torek:

git update-ref refs/heads/branch `git commit-tree -p branch~N -p 8222 branch^{tree} -m 'Commit message for a new merge commit'`

545

asked Sep 22 '15 21:09

user5365198

1 Answer

First some background ... merging, in git, has two functions:

1. It compares the "merge base" of two lines of development to the two tip commits, and then combines the changes, using a semi-intelligent¹ algorithm. That is, git merge other first computes $base,² then does git diff $base HEAD and git diff $base other. These two diffs tell git what you want done to $base, e.g., add a line to foo.txt, remove a line from bar.tex, and change a line in xyz.py. Git then tries to keep all these changes, and if the line added to foo.txt was added the same way in each branch, or the same line was removed from bar.tex, just keep one copy of the change.
2. It records, for future use, the fact that two lines of development were brought together, and that all the desired changes from both lines are now present in the "merge commit". The resulting commit is therefore suitable as a new merge-base, for example. We'll see something like this (not exactly this) in a moment.
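
The first function can be reproduced by hand. Here is a minimal sketch, assuming a throwaway repository shaped like the one in the question (the identity settings and the temporary directory are placeholders I made up, and the multiple-merge-base case from footnote 2 is ignored):

```shell
#!/bin/sh
set -e
# Throwaway repository mirroring the question's example (names are illustrative).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # make sure the first branch is called master
git config user.name "Example"; git config user.email "example@example.com"
echo 1 > 1; git add 1; git commit -q -m 1    # common base
git branch branch
echo 2 > 2; git add 2; git commit -q -m 2
echo 3 > 3; git add 3; git commit -q -m 3
git checkout -q branch
echo 4 > 4; git add 4; git commit -q -m 4

# What "git merge master" works out first: the merge base of HEAD and the other tip.
base=$(git merge-base HEAD master)

# The two diffs that the merge then tries to combine.
ours=$(git diff --name-status "$base" HEAD)      # what this branch did to the base
theirs=$(git diff --name-status "$base" master)  # what master did to the base
printf 'base: %s\nours:\n%s\ntheirs:\n%s\n' "$base" "$ours" "$theirs"
```

In this repository the two sides touch different files, so there is nothing for git to reconcile; the interesting cases are the ones where both diffs mention the same lines.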

These two functions are implemented in quite different ways. The first one is done by merging the diffs in your work-tree, leaving any too-difficult cases for you to handle manually. The second one is done at the time you make the merge commit, by listing two³ "parent commit IDs" in the new (merge) commit.

What you'll need to decide is how much of this to preserve, in what way. I think it helps immensely, in these cases, to draw (part of) the commit graph.

Here, in your example you start with a common base, which I will call B (for base), then have two more commits on master and one on branch:

B - C - D    <-- master
  \
    E   <-- branch

Now on branch you first merge commit C:

B - C - D   <-- master
  \   \
    E - F   <-- branch

Whether this needs manual intervention depends on the change from B to C, and the change from B to E, but in any case this makes new merge commit F.

You then ask git to merge in commit D. This, however, no longer uses commit B as the merge base, because following the new merge F backwards, git finds that the merge base (the most recent common commit) of branch and master is now commit C. The comparisons are therefore C to D, and C to F. Git combines these (perhaps having to stop and get help); when someone (you or git) commits the result, we get:

B - C - D   <-- master
  \   \   \
    E - F - G   <-- branch
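
The shift of the merge base from B to C can be observed directly. A small sketch, again assuming a scratch repository built like the question's example (identity and directory are placeholders):

```shell
#!/bin/sh
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"; git config user.email "example@example.com"
echo 1 > 1; git add 1; git commit -q -m 1    # B
git branch branch
echo 2 > 2; git add 2; git commit -q -m 2    # C
echo 3 > 3; git add 3; git commit -q -m 3    # D
git checkout -q branch
echo 4 > 4; git add 4; git commit -q -m 4    # E

before=$(git merge-base branch master)       # merge base while branch is still at E
git merge -q -m "merge C" master~1           # creates merge commit F on branch
after=$(git merge-base branch master)        # merge base that merging D will use

echo "before merging C: $(git log -1 --format=%s "$before")"
echo "after merging C:  $(git log -1 --format=%s "$after")"
```

The first line should name the commit with subject 1 (B) and the second the commit with subject 2 (C).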

Now, you're asking (I think) to "squash" F and G into a single merge commit. Given the way git works, you can only achieve this by creating a new merge commit, then making branch point to it (dropping the reference to G and hence the only reference to F as well).

There are two things to consider when making this new merge commit:

1. What tree (the set of files and directories attached to the commit, built from the index/staging area) do you want? You can only have one!
2. What parent commits do you want to list? To make it come right after commit E, you want its "first parent" to be E. To make it a merge commit, it must have at least one other parent.

The tree to choose is obvious: commit G has the desired merge-result, so use the tree from G.
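
Both ingredients, the tree and the parent list, are visible in the commit object itself. A sketch of how one might inspect G, assuming the same scratch repository as in the question (setup names are placeholders):

```shell
#!/bin/sh
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"; git config user.email "example@example.com"
echo 1 > 1; git add 1; git commit -q -m 1    # B
git branch branch
echo 2 > 2; git add 2; git commit -q -m 2    # C
echo 3 > 3; git add 3; git commit -q -m 3    # D
git checkout -q branch
echo 4 > 4; git add 4; git commit -q -m 4    # E
git merge -q -m "merge C" master~1           # F
git merge -q -m "merge D" master             # G

# The raw commit object: one "tree" line, then one "parent" line per parent.
git cat-file -p branch

# The same two pieces of information, extracted separately.
tree=$(git rev-parse 'branch^{tree}')
parents=$(git rev-list --parents -n 1 branch | cut -d' ' -f2-)
echo "tree of G:    $tree"
echo "parents of G: $parents"
```

The new merge commit will reuse this tree unchanged and swap in a different parent list.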

The second (and perhaps third) parent to use is less obvious. It's possible to list both C and D, giving an octopus merge O:

B - C - D   <-- master
  \    \ \
    E --- O   <-- branch

But this doesn't do anything particularly useful, because C is an ancestor of D. If you made a more normal-looking merge commit M that just points back to both E (as first parent) and D (as second parent), you'd get this:

B - C - D   <-- master
  \      \
    E ---- M   <-- branch

This would be "just as good" as the octopus merge: it has the same tree (we're picking the tree out of commit G each time) and it has the same first-parent (E). It has just one other parent (i.e., D), but that causes C to be in its history, so it doesn't need an explicit connection directly to C.

(It can have an explicit connection, if you want one, but there's no particularly good reason to give it one.)
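
To check the "just as good" claim, here is a sketch that builds both O and M as loose commit objects with git commit-tree (the plumbing command discussed below) and compares what each one reaches. No branch is moved; the repository setup and the messages are assumptions of the sketch:

```shell
#!/bin/sh
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"; git config user.email "example@example.com"
echo 1 > 1; git add 1; git commit -q -m 1    # B
git branch branch
echo 2 > 2; git add 2; git commit -q -m 2    # C
echo 3 > 3; git add 3; git commit -q -m 3    # D
git checkout -q branch
echo 4 > 4; git add 4; git commit -q -m 4    # E
git merge -q -m "merge C" master~1           # F
git merge -q -m "merge D" master             # G

e=$(git rev-parse branch~2); c=$(git rev-parse master~1); d=$(git rev-parse master)
tree=$(git rev-parse 'branch^{tree}')        # G's tree, used for both candidates

o=$(git commit-tree -p "$e" -p "$c" -p "$d" -m "octopus O" "$tree")
m=$(git commit-tree -p "$e" -p "$d" -m "merge M" "$tree")

# C is already reachable through D ...
git merge-base --is-ancestor "$c" "$d" && echo "C is an ancestor of D"

# ... so both candidates have the same set of ancestors.
anc_o=$(git rev-list "$o^@" | sort)          # rev^@ means "all parents of rev"
anc_m=$(git rev-list "$m^@" | sort)
[ "$anc_o" = "$anc_m" ] && echo "O and M reach the same commits"
```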

That leaves one last problem: the mechanics of actually producing this merge (M or O, whichever you prefer) and getting branch to point to it.

There's a "plumbing" command that creates it directly, assuming you're in this state at the moment:

B - C - D   <-- master
  \   \   \
    E - F - G   <-- branch

At this point you can run (note, these are untested):

$ commit=$(git commit-tree -p branch~2 -p master branch^{tree} < msg)

where msg is a file containing the desired commit message. This creates commit M (with just the two parents, the first one being commit E, i.e., branch~2 and the second being commit D, i.e., master). Then:

$ git update-ref refs/heads/branch $commit
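
Put together on a scratch copy of the question's repository, the plumbing route might look like the following sketch. The message text, the temporary message file, and the identity settings are assumptions of mine; the two plumbing commands are the ones shown above:

```shell
#!/bin/sh
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"; git config user.email "example@example.com"
echo 1 > 1; git add 1; git commit -q -m 1    # B
git branch branch
echo 2 > 2; git add 2; git commit -q -m 2    # C
echo 3 > 3; git add 3; git commit -q -m 3    # D
git checkout -q branch
echo 4 > 4; git add 4; git commit -q -m 4    # E
git merge -q -m "merge C" master~1           # F
git merge -q -m "merge D" master             # G

# Remember what the result is supposed to look like.
e=$(git rev-parse branch~2)                  # E, two first-parent steps back from G
d=$(git rev-parse master)                    # D
g_tree=$(git rev-parse 'branch^{tree}')      # tree of G

# Commit message in a file kept outside the work tree.
msg=$(mktemp)
echo "Merge C and D from master into branch" > "$msg"

# Create M (parents E and D, tree of G), then point branch at it.
commit=$(git commit-tree -p branch~2 -p master 'branch^{tree}' < "$msg")
git update-ref refs/heads/branch "$commit"

git log --graph --oneline branch
```

Because M carries the same tree as G, the index and work tree of the checked-out branch still match after the update. F and G are no longer reachable from branch, though they remain findable through the reflog for a while.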

It's possible to create M using just ordinary git "porcelain" commands, but it requires a bit more trickiness. First we want to save the desired tree, i.e., branch^{tree}:

$ tree=$(git rev-parse branch^{tree})

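Then, while on branch branch (because git reset moves the current branch), tell git to back up two commits. Use a hard reset here: the tree we want is already saved in $tree, and a soft or mixed reset would leave G's files staged or sitting untracked in the work tree, where they can make the upcoming git merge refuse to run. (A hard reset discards uncommitted changes, so start from a clean work tree.)

$ git reset --hard HEAD~2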

Then, tell git we're merging master, but don't commit no matter what. (This is just to get git to set up .git/MERGE_MSG and .git/MERGE_HEAD -- you could write those files directly, instead of doing this part.)

$ git merge --no-commit master

Then wipe out whatever's in the work tree and index, replacing it all with the saved tree from commit G instead:

$ git rm -rf .; git checkout $tree -- .

(you'll need to be in the top level directory of the work tree for this). This prepares the index for the new commit, and now there's just that one last commit to make:

$ git commit

This commits the merge, using the tree whose ID we grabbed before doing the git reset. The git reset made the tip commit of branch be commit E, so our new commit is merge-commit M, the same as if we had used the low level plumbing commands.
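
For completeness, here is the porcelain sequence as one script against a scratch copy of the question's repository. Two choices are assumptions of this sketch rather than part of the steps above: it uses git reset --hard so the merge starts from a clean index and work tree, and it passes -m to git commit so that no editor opens:

```shell
#!/bin/sh
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"; git config user.email "example@example.com"
echo 1 > 1; git add 1; git commit -q -m 1    # B
git branch branch
echo 2 > 2; git add 2; git commit -q -m 2    # C
echo 3 > 3; git add 3; git commit -q -m 3    # D
git checkout -q branch
echo 4 > 4; git add 4; git commit -q -m 4    # E
git merge -q -m "merge C" master~1           # F
git merge -q -m "merge D" master             # G

e=$(git rev-parse branch~2)                  # E
d=$(git rev-parse master)                    # D

tree=$(git rev-parse 'branch^{tree}')        # save the tree of G
git reset -q --hard HEAD~2                   # branch now points at E
git merge -q --no-commit master              # records MERGE_HEAD, stops before committing
git rm -rf -q .                              # empty the index and work tree
git checkout "$tree" -- .                    # refill both from the saved tree
git commit -q -m "Merge C and D from master into branch"

git log --graph --oneline branch
```

In this toy repository the merge of D into E is clean anyway, so the saved tree and the freshly merged one coincide; in a real project the saved tree is what carries over the conflict resolutions made while creating F and G.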

(Note: git rebase -i -p can't quite do this, it's not smart enough to know when this is OK—which is only when we have set it up to be OK in the first place.)

¹It's not particularly smart, but it handles the easy cases pretty well.

²You can compute this yourself using git merge-base, with base=$(git merge-base HEAD other). For some (uncommon but not as rare as one might hope) cases, though, there may be more than one suitable merge base. The default merge algorithm ("recursive") will merge two merge-bases to come up with a "virtual base" and use that. So there are some subtle differences if you try to do this all manually.

³There can be more than two parents: the definition of a "merge commit" is any commit with more than one parent. Multi-way merges ("octopus" merges) are done differently, though (using the "octopus merge" strategy). So, for the most part, I'm ignoring these here.
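
As an illustration of footnote 2, a "criss-cross" history is one way to end up with more than one merge base. The following sketch builds one in a scratch repository (branch and file names are made up) and lists the candidates with git merge-base --all:

```shell
#!/bin/sh
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"; git config user.email "example@example.com"
echo a > a; git add a; git commit -q -m A
git branch other
echo m > m; git add m; git commit -q -m M1       # master: A - M1
git checkout -q other
echo x > x; git add x; git commit -q -m X1       # other:  A - X1
git merge -q -m "other: merge M1" master         # X2, parents X1 and M1
git checkout -q master
git merge -q -m "master: merge X1" other~1       # M2, parents M1 and X1

# Neither M1 nor X1 is an ancestor of the other, and both are common
# ancestors of the two tips, so both count as merge bases.
all=$(git merge-base --all master other)
echo "$all"
```

Plain git merge-base prints only one of these, which is one reason a hand-made three-way diff can differ from what git merge itself does.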

139

answered Oct 10 '22 19:10

torek
