
git merge vs git rebase for merge conflict scenarios

I want to make sure that I am looking at this correctly.

When I do a git merge that results in a conflict, the conflicted file looks like this:

<<<<<<<<<HEAD
my local changes first
=============
The remote github changes here.
>>>>>>>>>>

Whereas when I run into conflicts as a result of git rebase, I see the opposite:

<<<<<<<<<
The remote github changes here.
=============
my local changes first
>>>>>>>>>>

Am I missing anything here?

The Roy asked Dec 04 '22 17:12


1 Answer

Tim Biegeleisen's answer is right but I'd draw the diagram a bit differently. In your own (local) Git repository, you have a series of commits like this when you start:

...--G--H   <-- origin/somebranch
         \
          I--J   <-- somebranch (HEAD)

That is, you have made one or more of your own commits—here, I've labeled them I and J; their real names are some big ugly hash IDs—and your branch name, somebranch, points to (contains the hash ID of) the last of these new commits that you made.

You then run git pull --rebase origin somebranch, or (my preferred method) the two separate commands git fetch followed by git rebase origin/somebranch. The two-step sequence is what git pull does for you: it runs two Git commands, the first one always being git fetch, and the second one being a command you pick in advance, before you see what git fetch does. (I like to see what git fetch did, and then decide: do I rebase, or merge, or wait, or do something else entirely?)
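The fetch-then-decide workflow can be tried out safely in throwaway repositories. The following is only a sketch: the directory names theirs and mine, the file names, and the commit messages are all made up for illustration, and a plain local repository stands in for GitHub.

```shell
#!/bin/sh
# Sketch only: two throwaway repositories stand in for the remote and your clone.
set -e
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
work=$(mktemp -d)
cd "$work"

# "theirs" (hypothetical name) plays the role of the remote; it starts with commit H.
git init -q theirs
git -C theirs symbolic-ref HEAD refs/heads/somebranch
echo "base" > theirs/shared.txt
git -C theirs add shared.txt
git -C theirs commit -q -m "H"

# "mine" is your clone, so origin/somebranch starts out at H.
git clone -q theirs mine

# Someone else adds commit K on the remote; you add commit I locally.
echo "their work" > theirs/theirs.txt
git -C theirs add theirs.txt
git -C theirs commit -q -m "K"
echo "my work" > mine/mine.txt
git -C mine add mine.txt
git -C mine commit -q -m "I"

# Step 1: fetch only. Your branch and working tree are not touched.
git -C mine fetch -q origin

# Look before choosing the second step.
incoming=$(git -C mine rev-list --count HEAD..origin/somebranch)
outgoing=$(git -C mine rev-list --count origin/somebranch..HEAD)
echo "incoming: $incoming, outgoing: $outgoing"
git --no-pager -C mine log --oneline --graph HEAD origin/somebranch

# Step 2, picked after looking: in this sketch, a rebase.
git -C mine rebase -q origin/somebranch
```

The two rev-list counts show how many commits each side has that the other lacks, which is usually enough to decide between rebasing, merging, or waiting.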

The git fetch step picked up new commits that someone else made, giving you this:

...--G--H------K--L   <-- origin/somebranch
         \
          I--J   <-- somebranch (HEAD)

Again, the uppercase letters stand in for some real actual hash ID, whatever it might be.

When you use git merge, Git, well, merges. The graph can be drawn slightly differently to make it clearer:

          I--J   <-- somebranch (HEAD)
         /
...--G--H
         \
          K--L   <-- origin/somebranch

The common starting point for the merge is commit H; your commit, HEAD, is commit J; and their commit is of course L. So if there is a conflict, what you'll see in the in-progress merge as HEAD is your code from J and what you'll see as theirs is what's in L. If you set merge.conflictStyle to diff3, what you'll see as the base is what's in H.[1]
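To see these three inputs in a conflicted file, here is a minimal sketch in a throwaway repository. The file name, its contents, and the helper branch their-work are hypothetical, and origin/somebranch is faked with git update-ref rather than fetched from a real remote.

```shell
#!/bin/sh
# Sketch only: file names, contents, and the their-work branch are made up.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/somebranch
git config user.name "Example"
git config user.email "example@example.com"

# Commit H: the shared starting point (the merge base).
echo "original line" > file.txt
git add file.txt
git commit -q -m "H"

# Commit L: "their" change, recorded as origin/somebranch without a real remote.
git checkout -q -b their-work
echo "the remote change" > file.txt
git commit -q -am "L"
git update-ref refs/remotes/origin/somebranch HEAD

# Commit J: your local change on somebranch.
git checkout -q somebranch
echo "my local change" > file.txt
git commit -q -am "J"

# Merge with diff3-style markers; the merge stops with a conflict.
git -c merge.conflictStyle=diff3 merge origin/somebranch >/dev/null 2>&1 || true

# The HEAD section holds J's text, the middle section H's, the last section L's.
cat file.txt
```

The exact label after the ||||||| base marker varies between Git versions, so it is best not to rely on it.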

Note that there are three inputs to the merge. Commit H is the merge base, and commits J and L (HEAD and theirs) are the two branch tips involved. The final result of doing a full merge operation here will be a new merge commit M, which will point back to both of its two direct inputs:

          I--J
         /    \
...--G--H      M   <-- somebranch (HEAD)
         \    /
          K--L   <-- origin/somebranch

The snapshot in merge M is the result of applying the combined changes to the snapshot in commit H. That is, Git found:

  • the difference from H to J: what you changed;
  • the difference from H to L: what they changed;

and tried to combine these on its own. Git had a problem combining them—a merge conflict—and gave up and forced you to combine them. Once you did, and used git merge --continue to finish the process, Git made M from the combined results.

(Commit M does not remember commit H directly. Git can re-discover the merge base H later, if necessary, using the same process it used to find it this time.[2])
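Both points can be checked directly: the merge commit lists J and L as parents, and git merge-base recomputes H on demand. This is a sketch in a throwaway repository; the file contents, the their-work branch, and the "keep both lines" resolution are assumptions chosen only for illustration.

```shell
#!/bin/sh
# Sketch only: names, contents, and the conflict resolution are made up.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/somebranch
git config user.name "Example"
git config user.email "example@example.com"

echo "original line" > file.txt
git add file.txt
git commit -q -m "H"
H=$(git rev-parse HEAD)

git checkout -q -b their-work
echo "the remote change" > file.txt
git commit -q -am "L"
L=$(git rev-parse HEAD)
git update-ref refs/remotes/origin/somebranch "$L"

git checkout -q somebranch
echo "my local change" > file.txt
git commit -q -am "J"
J=$(git rev-parse HEAD)

# The merge conflicts; resolve by hand (here: keep both lines), then continue.
git merge origin/somebranch >/dev/null 2>&1 || true
printf 'my local change\nthe remote change\n' > file.txt
git add file.txt
GIT_EDITOR=true git merge --continue >/dev/null

# M records its two parents, but not the merge base.
echo "parents of M: $(git rev-parse 'HEAD^1') $(git rev-parse 'HEAD^2')"
echo "merge base:   $(git merge-base "$J" "$L")"
```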


[1] I like to set this option. That way, you see not just what you put in and what they put in, but also what was there originally in the merge base commit. This is especially useful when you or they deleted the code in question.

[2] This is actually kind of a bug, because you can run git merge with options that modify things, including (in some relatively rare cases) the merge base used. The merge command should record the options you used, to make the merge truly repeatable.


When you use git rebase, though, Git copies each of your existing commits—two, in this case—one at a time. This copying process uses a "detached HEAD", where HEAD points directly to a commit. Git starts by checking out their commit L as a detached HEAD, like this:

...--G--H------K--L   <-- HEAD, origin/somebranch
         \
          I--J   <-- somebranch

Git then copies each commit as if by running git cherry-pick. Now, technically, a cherry-pick is a form of merge, or as I like to put it, merge as a verb: the process of merging, without actually making a merge commit. That is, you are still doing all the same work you would do with git merge. The differences lie in the input commits to the merge, and that when you're done, the final commit isn't a merge commit: it's just a regular, ordinary, everyday commit, with one parent.

So, now that Git has done a git checkout --detach origin/somebranch so that their commit L is your current commit, it does a git cherry-pick <hash-of-I> to copy commit I. This cherry-pick starts off the merge process. The three inputs to this particular merge are:

  • the merge base, which is the parent of the commit Git was told to cherry-pick: that's H;
  • the --ours commit, which is always HEAD, and in this case is commit L: their commit; and
  • the --theirs commit, which is the commit Git was told to cherry-pick: that's I, which is your commit.

So the --theirs commit for the merge operation is your commit, and the HEAD or --ours commit for the merge operation is their commit L! This is where this apparent reversal comes from. Git is doing a cherry-pick, which is a form of merge. The --ours input is their commit and the --theirs input is your commit.
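The reversal is easy to confirm by looking at the index during a conflicted rebase, where stage 2 is the --ours side and stage 3 is the --theirs side. This is a sketch in a throwaway repository: the file contents and the their-work helper branch are hypothetical, and only one local commit (I) is used to keep it short.

```shell
#!/bin/sh
# Sketch only: names and contents are made up for illustration.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/somebranch
git config user.name "Example"
git config user.email "example@example.com"

echo "original line" > file.txt
git add file.txt
git commit -q -m "H"

# Their commit L, recorded as origin/somebranch without a real remote.
git checkout -q -b their-work
echo "the remote change" > file.txt
git commit -q -am "L"
L=$(git rev-parse HEAD)
git update-ref refs/remotes/origin/somebranch "$L"

# Your commit I on somebranch, touching the same line.
git checkout -q somebranch
echo "my local change" > file.txt
git commit -q -am "I"

# The rebase checks out L as a detached HEAD, then stops while copying I.
git rebase origin/somebranch >/dev/null 2>&1 || true

ours=$(git show :2:file.txt)     # stage 2 = --ours  = HEAD = their commit L
theirs=$(git show :3:file.txt)   # stage 3 = --theirs = the commit being copied = your I
echo "--ours:   $ours"
echo "--theirs: $theirs"
```

The same stages are what git checkout --ours and git checkout --theirs pick from, so during a rebase --theirs selects your own version of the file.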

After you resolve any merge conflicts, you will run git rebase --continue. (If you had run the git cherry-pick yourself you would run git cherry-pick --continue; git rebase takes care of doing that for you.) This will have the cherry-pick finish up, which it does by making an ordinary commit:

                    I'  <-- HEAD
                   /
...--G--H------K--L   <-- origin/somebranch
         \
          I--J   <-- somebranch

The detached HEAD now points directly to this new ordinary commit, this copy I' of original commit I. Note that commit I' is "just like" commit I except that:

  • it has a different parent commit, L; and
  • it has a different snapshot. The snapshot in I' is the result of taking the difference between H and I (i.e., what you changed) and merging that difference with the difference between H and L.
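The copy step can be reproduced by hand to compare I with I'. The sketch below uses a throwaway repository with non-conflicting changes so the cherry-pick finishes on its own; the file names and the their-work branch are hypothetical.

```shell
#!/bin/sh
# Sketch only: names and contents are made up for illustration.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/somebranch
git config user.name "Example"
git config user.email "example@example.com"

echo "base" > shared.txt
git add shared.txt
git commit -q -m "H"

# Their commit L adds one file.
git checkout -q -b their-work
echo "their work" > theirs.txt
git add theirs.txt
git commit -q -m "L"
L=$(git rev-parse HEAD)
git update-ref refs/remotes/origin/somebranch "$L"

# Your commit I adds a different file, so nothing conflicts.
git checkout -q somebranch
echo "my work" > mine.txt
git add mine.txt
git commit -q -m "I"
I=$(git rev-parse HEAD)

# What rebase does for the first commit: detach at L, then copy I.
git checkout -q --detach origin/somebranch
git cherry-pick "$I" >/dev/null
Iprime=$(git rev-parse HEAD)

echo "I : parent $(git rev-parse "$I^") tree $(git rev-parse "$I^{tree}")"
echo "I': parent $(git rev-parse "$Iprime^") tree $(git rev-parse "$Iprime^{tree}")"
```

The commit message is carried over, but the parent and the tree (snapshot) differ, so the hash ID differs too.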

Alas, because this is git rebase rather than git merge, we're not yet done. Now we must copy commit J too, as if by git cherry-pick <hash-of-J>. Our situation is still that the detached HEAD points to new commit I'. The three inputs to this merge are:

  • the merge base: the parent of J, i.e., I;
  • the HEAD commit as --ours: commit I', the one we just made; and
  • the to-be-copied commit as --theirs: commit J, i.e., your second commit.

As always for a merge, Git compares the snapshot in the merge base to each of the two tip commits. So Git:

  1. Compares your snapshot from your I to your own I', to see what you changed: that's their code you brought in via commit L. That is what will show up in <<<<<<< HEAD, if there is a conflict.
  2. Compares your snapshot from your I to your own J, to see what "they" changed: that's your change when you made J. This is what will show up in >>>>>>> theirs, if there is a conflict.

This time, instead of HEAD being just their code, it's now a mix of their code and your code, on the --ours side of a conflict. Meanwhile the --theirs side of any conflict continues to be your code, this time from commit J. Once you resolve the conflicts and use git rebase --continue, Git will make a new ordinary commit J' like this:

                    I'-J'  <-- HEAD
                   /
...--G--H------K--L   <-- origin/somebranch
         \
          I--J   <-- somebranch

Here J' is the cherry-picked copy of J.

Since these are all the commits that had to be copied, Git now finishes off the rebase by yanking the name somebranch away from commit J and attaching it instead to new commit J', then re-attaching HEAD to the name somebranch:

                    I'-J'  <-- somebranch (HEAD)
                   /
...--G--H------K--L   <-- origin/somebranch
         \
          I--J   [abandoned]

and the rebase is complete. Running git log will show you your new copies, and no longer show you the original commits I and J. The original commits will get reclaimed and destroyed eventually (typically some time after 30 days have passed).
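Until they are reclaimed, the abandoned originals can still be found through the branch's reflog. A minimal sketch in a throwaway repository, with hypothetical file names and a their-work helper branch, and with non-conflicting changes so the rebase runs unattended:

```shell
#!/bin/sh
# Sketch only: names and contents are made up for illustration.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/somebranch
git config user.name "Example"
git config user.email "example@example.com"

echo "base" > shared.txt
git add shared.txt
git commit -q -m "H"

git checkout -q -b their-work
echo "their work" > theirs.txt
git add theirs.txt
git commit -q -m "L"
L=$(git rev-parse HEAD)
git update-ref refs/remotes/origin/somebranch "$L"

# Your two commits, I and J.
git checkout -q somebranch
echo "one" > mine1.txt
git add mine1.txt
git commit -q -m "I"
echo "two" > mine2.txt
git add mine2.txt
git commit -q -m "J"
J=$(git rev-parse HEAD)

git rebase -q origin/somebranch >/dev/null 2>&1

# git log now shows the copies I' and J' on top of L.
git --no-pager log --oneline --graph somebranch

# The original J is no longer on the branch, but the reflog still names it.
echo "old tip: $(git rev-parse 'somebranch@{1}')"
```

If the rebase result turns out to be wrong, something like git reset --hard 'somebranch@{1}' would move the branch back to the original commits while the reflog entry still exists.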

This is what makes rebasing fundamentally trickier than merging. A rebase involves repeated cherry-picks, and each cherry-pick is a merge. If you have to copy ten commits, you are doing ten merges. Git can usually do them automatically, and Git usually gets them right, but every merge is just Git stupidly applying some simple text-difference-combining rules, so every merge is an opportunity for errors. You must carefully inspect and/or test the result. Ideally, you should inspect and/or test all ten of these copies, but if the last one is good, probably all the others are too.
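One way to test every copy rather than just the last is git rebase --exec, which runs a command after each commit is applied and stops the rebase if the command fails. The sketch below uses a throwaway repository; the check itself (a file-existence test that appends to a log) is a hypothetical stand-in for a real test suite, and CHECK_LOG is a made-up variable name.

```shell
#!/bin/sh
# Sketch only: the "test" run after each copy is a trivial stand-in.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/somebranch
git config user.name "Example"
git config user.email "example@example.com"

echo "base" > shared.txt
git add shared.txt
git commit -q -m "H"

git checkout -q -b their-work
echo "their work" > theirs.txt
git add theirs.txt
git commit -q -m "L"
git update-ref refs/remotes/origin/somebranch HEAD

git checkout -q somebranch
echo "one" > mine1.txt
git add mine1.txt
git commit -q -m "I"
echo "two" > mine2.txt
git add mine2.txt
git commit -q -m "J"

# Log file kept outside the repository so the working tree stays clean.
CHECK_LOG=$(mktemp)
export CHECK_LOG

# Run the check once per copied commit; a failing check would halt the rebase.
git rebase -q \
  --exec 'test -f shared.txt && echo "checked $(git rev-parse --short HEAD)" >> "$CHECK_LOG"' \
  origin/somebranch >/dev/null 2>&1

cat "$CHECK_LOG"
```

With two commits to copy, the log ends up with two lines, one for I' and one for J'.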

torek answered Dec 08 '22 15:12