
Does the order of Git merging matter?

Tags: git, git-merge

Suppose I have two branches, A and B. Do the following properties hold?

  • Merging A into B conflicts if and only if merging B into A conflicts.
  • The contents of my files after merging A into B are the same as the contents of my files after merging B into A.

Tom Ellis asked Apr 08 '18 07:04



2 Answers

cmaster's answer is correct, with caveats. Let's start by noting these items / assumptions:

  • There is always a single merge base commit. Let's call this commit B, for base.
  • The other two inputs are also single commits. Let's call them L for left / local (--ours) and R for right / remote (--theirs).

The first assumption is not necessarily true. If there are multiple merge base candidates, it is up to the merge strategy to do something about this. The two standard two-head merge strategies are recursive and resolve. The resolve strategy simply picks one at (apparent) random. The recursive strategy merges the merge bases two at a time, and then uses the resulting commit as the merge base. The one chosen by resolve can be affected by the order of arguments to git merge-base and hence to git merge, so that's one caveat right there. Because the recursive strategy can do more than one merge, there's a second caveat here that is easier to describe later (see the last section), but it applies only if there are more than two merge bases.

The second assumption is much more true, but note that the merge code can run on a partially-modified work-tree. In this case all bets are off, since the work-tree does not match either L or R. A standard git merge will tell you that you must commit first, though, so normally this is not a problem.

Merge strategies matter

We already noted the issue with multiple merge bases. We're assuming a two-head merge as well.

Octopus merges can deal with multiple heads. This also changes the merge base computation, but in general an octopus merge won't work in cases that have complicated merge issues, and will just refuse to run where the order might matter. I would not push hard on it, though: this is another case where the symmetry rule is likely to fail.

The -s ours merge strategy completely ignores all other commits so merge order is obviously crucial here: the result is always L. (I am fairly sure that -s ours does not even bother computing a merge base B.)

You can write your own strategy and do whatever you want. Here, you can make the order matter, as it does with -s ours.

High level merging (with one merge base): file name changes

Git now computes, in effect, two change-sets from these three snapshots:

  • L - B, or git diff --find-renames B L
  • R - B, or git diff --find-renames B R

The rename detectors here are independent—by this I mean neither affects the other; both use the same rules though. The main issue here is that it's possible for the same file in B to be detected as renamed in both change-sets, in which case we get what I call a high level conflict, specifically a rename/rename conflict. (We can also get high level conflicts with rename/delete and several other cases.) For a rename/rename conflict, the final name that Git chooses is the name in L, not the name in R. So here, the order matters in terms of final file name. This does not affect the work-tree merged content.
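The path choice described above can be modeled with a small function. This is only a sketch of the rule as stated in this answer, not Git's actual rename-handling code; `merged_path` and its arguments are hypothetical names introduced for illustration.

```python
from typing import Tuple


def merged_path(base: str, left: str, right: str) -> Tuple[str, bool]:
    """Toy model: pick the final path for one file seen in B, L and R.

    Returns (final_path, is_rename_rename_conflict). Assumes rename
    detection has already paired the three paths as "the same" file.
    """
    left_renamed = left != base
    right_renamed = right != base
    if left_renamed and right_renamed and left != right:
        # Both sides renamed the file differently: a high level conflict,
        # and (per the description above) the name from L is the one kept.
        return left, True
    if right_renamed and not left_renamed:
        # Only the B-vs-R side renamed the file: take R's name.
        return right, False
    # Either nobody renamed, only L renamed, or both chose the same name.
    return left, False
```

Swapping the `left` and `right` arguments in the conflicting case changes the returned path, which is the order dependence described above; in every non-conflicting case the result is the same either way.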

Low level merging

At this point we should take a small tour of Git's internals. We have now paired up files in B-vs-L and in B-vs-R, i.e., we know which files are "the same" files in each of the three commits. However, the way Git stores files and commits is interesting. From a logical point of view, Git has no deltas: each commit is a complete snapshot of all files. Each file, however, is just a pair of entities: a path name P and a hash ID H.

In other words, at this point, there is no need to walk through all the commits leading from B to either L or R. We know that we have some file F, identified by up to three separate path names (and as noted above, Git will use the L path in most cases, but use the R path if there is only one rename in the B-vs-R side of the merge). The complete contents of all three files are available by direct lookup: HB represents the base file content, HL represents the left-side file, and HR represents the right-side file.

Two files match exactly if and only if their hashes match.¹ So at this point Git just compares the hash IDs. If all three match, the merged file is the same as the left and right and base files: there is no work. If L and R match, the merged file is the L or R content; the base is irrelevant as both sides made the same change. If B matches either L or R but not the other, the merged file is the non-matching hash. Git only has to do the low-level merge if there is a potential for a low-level merge conflict.
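As a rough illustration of this hash-only shortcut, here is a minimal sketch. `blob_id` follows the way Git hashes a blob (SHA-1 over a "blob <size>" header, a NUL byte, then the data); `resolve_by_hash` is a hypothetical helper name, not a Git function.

```python
import hashlib
from typing import Optional


def blob_id(content: bytes) -> str:
    """Git-style blob ID: SHA-1 of 'blob <size>', a NUL byte, then the data."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


def resolve_by_hash(hb: str, hl: str, hr: str) -> Optional[str]:
    """Return the merged file's hash if no content merge is needed, else None."""
    if hl == hr:
        return hl      # all three equal, or both sides made the same change
    if hb == hl:
        return hr      # only the right side changed the file
    if hb == hr:
        return hl      # only the left side changed the file
    return None        # both sides changed it differently: low-level merge


base = blob_id(b"hello\n")
left = blob_id(b"hello, left\n")
right = blob_id(b"hello, right\n")

print(resolve_by_hash(base, left, base) == left)   # True: take the changed side
print(resolve_by_hash(base, left, right))          # None: needs a content merge
```

Every branch of `resolve_by_hash` gives the same answer when `hl` and `hr` are swapped, so this stage cannot introduce any order dependence.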

So now, Git extracts the three contents and does the merge. This works on a line-by-line basis (with lines grouped together when multiple adjacent lines are changed):

  • If both left and right sides touched only different source lines, Git will take both changes. This is clearly symmetric.

  • If left and right touched the same source lines, Git will check whether the change itself is also the same. If so, Git will take one copy of the change. This, too, is clearly symmetric.

  • If left and right touched the same lines, but made different changes, Git will declare a merge conflict. The work-tree content will depend on the order of the changes, since the work-tree content has <<<<<<< HEAD ... ||||||| base ... ======= ... other >>>>>>> markers (the base section is optional, appearing if you choose diff3 style).

The definition of the same lines is a little tricky. This does depend on the diff algorithm (which you may select), since some sections of some files may repeat. However, Git always uses a single algorithm for computing both the B-vs-L and the B-vs-R diffs, so the order does not matter here.
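The three cases above can be sketched with a deliberately simplified model. It assumes all three versions have the same number of lines and that lines are only edited in place, so it skips the diff step and the grouping of adjacent lines that Git really performs; `merge_lines` and its label parameters are hypothetical.

```python
def merge_lines(base, left, right, left_label="HEAD", right_label="other"):
    """Toy three-way merge; returns (merged_lines, had_conflict).

    Assumption: base, left and right have equal length and differ only
    by in-place line edits (no insertions or deletions).
    """
    if not (len(base) == len(left) == len(right)):
        raise ValueError("toy model: all versions need the same line count")
    merged, conflict = [], False
    for b, l, r in zip(base, left, right):
        if l == r:
            merged.append(l)       # untouched, or both made the same change
        elif l == b:
            merged.append(r)       # only the right side touched this line
        elif r == b:
            merged.append(l)       # only the left side touched this line
        else:
            conflict = True        # same line, different changes
            merged += [f"<<<<<<< {left_label}", l, "=======",
                       r, f">>>>>>> {right_label}"]
    return merged, conflict


base = ["a", "b", "c"]
clean_1, _ = merge_lines(base, ["A", "b", "c"], ["a", "b", "C"])
clean_2, _ = merge_lines(base, ["a", "b", "C"], ["A", "b", "c"])
print(clean_1 == clean_2)   # True: non-conflicting merges are symmetric
```

In this model a conflict is reported in one direction exactly when it is reported in the other; only the text between the markers swaps places.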


¹ To put this another way, if you manage to produce a Doppelgänger file (one that has different content from, but the same hash as, some existing file), Git simply refuses to put that file into the repository. The shattered.it PDF is not such a file, because Git prefixes the file's data with the word blob and the size of the file, but the principle applies. Note that putting such a file into SVN breaks SVN (well, sort of).


-X options are obviously asymmetric

You can override merge conflict complaints using -X ours or -X theirs. These direct Git to resolve conflicts in favor of the L or R change respectively.
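To make the asymmetry concrete, here is a minimal sketch of resolving a single changed region with an optional preference. `resolve_hunk` and its `favor` parameter are hypothetical names that only mimic the effect of -X ours / -X theirs on one region.

```python
def resolve_hunk(base, left, right, favor=None):
    """Resolve one changed region; favor is None, "ours" or "theirs"."""
    if left == right:
        return left            # same change on both sides
    if left == base:
        return right           # only the right side changed it
    if right == base:
        return left            # only the left side changed it
    # Genuine conflict: this is the only place the preference is consulted.
    if favor == "ours":
        return left
    if favor == "theirs":
        return right
    raise ValueError("conflict: both sides changed this region differently")


print(resolve_hunk("old", "mine", "yours", favor="ours"))    # mine
print(resolve_hunk("old", "yours", "mine", favor="ours"))    # yours
print(resolve_hunk("old", "old", "yours", favor="ours"))     # yours
```

The last call shows that the preference is not the same thing as -s ours: a non-conflicting change from the other side is still taken.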

Merging makes a merge commit, which affects merge base computation

This symmetry principle, even with the above caveats, is fine for a single merge. But once you have made a merge, the next merge you run will use the modified commit graph to compute the new merge base. If you have two merges that you intend to do, and you do them as:

git merge one    (and fix conflicts and commit if needed)
git merge two    (fix conflicts and commit if needed)

then even if everything is symmetric in each merge, that does not mean that you will necessarily get the same result as if you run:

git merge two
git merge one

Whichever merge runs first, you get a merge commit, and the second merge now finds a different merge base.

This is particularly important if you do have conflicts that you must fix before finishing whichever merge goes first, since that also affects the L input to the second git merge command. It will use the first merge's snapshot as L, and the new (maybe different) merge base as B, for two of its three inputs.
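The effect on the merge base can be seen on a small hand-built commit graph. This is a sketch, not Git's implementation: the graph, the commit names, and the helpers `ancestors` and `merge_bases` are all made up for illustration. Here the branches one and two share a commit P that the current branch does not have yet.

```python
def ancestors(graph, commit):
    """All commits reachable from `commit`, including `commit` itself."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(graph[c])
    return seen


def merge_bases(graph, a, b):
    """Common ancestors of a and b that are not ancestors of another one."""
    common = ancestors(graph, a) & ancestors(graph, b)
    return {c for c in common
            if not any(c != d and c in ancestors(graph, d) for d in common)}


# Each commit maps to its list of parents.
graph = {
    "O": [],          # root commit
    "M": ["O"],       # tip of the current branch
    "P": ["O"],       # commit shared by branches one and two
    "one": ["P"],     # tip of branch one
    "two": ["P"],     # tip of branch two
}

before = merge_bases(graph, "M", "two")   # base for `git merge two` run first
graph["M1"] = ["M", "one"]                # merge commit made by `git merge one`
after = merge_bases(graph, "M1", "two")   # base for `git merge two` run second

print(before, after)   # {'O'} {'P'}
```

Run first, the merge of two uses O as its base; run after the merge of one, it uses P, so the two orders feed different B and L inputs into the same command.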

This is why I mentioned that -s recursive has potential order differences when working with multiple merge bases. Suppose there are three merge bases. Git will merge the first two (in whatever order they pop out of the merge base computation), commit the result (even if there are merge conflicts—it just commits the conflicts in this case), and then merge that commit with the third commit and commit the result. The final commit here is then the input B. Only if all parts of this process are symmetric will that final B result be order-insensitive. Most merges are symmetric, but we saw all the caveats above.
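The pairwise folding of merge bases can be sketched abstractly. This toy version treats the pairwise merge as an arbitrary function and ignores how Git picks the base for each inner merge; `virtual_base`, `union_merge`, and `left_wins` are hypothetical names, and the snapshots are plain Python values rather than commits.

```python
from functools import reduce


def virtual_base(bases, merge_pair):
    """Fold the merge bases two at a time into a single base snapshot."""
    return reduce(merge_pair, bases)


def union_merge(x, y):
    """A symmetric pairwise merge: the union of two sets of lines."""
    return x | y


def left_wins(x, y):
    """An asymmetric pairwise merge: on a shared key, x's value is kept."""
    return {**y, **x}


sets = [{"a"}, {"b"}, {"c"}]
print(virtual_base(sets, union_merge) == virtual_base(sets[::-1], union_merge))

snapshots = [{"f": 1}, {"f": 2}, {"f": 3}]
print(virtual_base(snapshots, left_wins))         # {'f': 1}
print(virtual_base(snapshots[::-1], left_wins))   # {'f': 3}
```

With a symmetric (and associative) pairwise merge the folded base does not depend on the order of the inputs; as soon as any step favors one side, it does.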

torek answered Oct 12 '22 16:10


I would go as far as to say that if your two properties are not met, then you have found a bug in git merge.

Rationale: Merging concurrently in all directions is the very purpose for which git has been built. That is why git has been using 3-way merges from the off: it's the only way to provide correct merge results. This 3-way merge is symmetric from a mathematical point of view: it basically computes a state R = (A - B) + (C - B) + B based on a base commit B from the diverged states A and C. The only difference that comes from merging order should be the order of the parents of the merge commit.
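One way to read that formula is with plain sets, where a state is a set of items and a difference from the base consists of additions and removals. This is only a sketch at the same high abstraction level: `three_way` is a hypothetical name, and a set model cannot express conflicts at all.

```python
def three_way(base: set, a: set, c: set) -> set:
    """Toy R = (A - B) + (C - B) + B on sets of items."""
    added = (a - base) | (c - base)      # items either side introduced
    removed = (base - a) | (base - c)    # items either side dropped
    return (base - removed) | added


base = {"x", "y"}
a = {"x", "y", "z"}     # A added z
c = {"x"}               # C removed y

print(sorted(three_way(base, a, c)))                  # ['x', 'z']
print(three_way(base, a, c) == three_way(base, c, a)) # True
```

Both `added` and `removed` are built with a union, which is commutative, so swapping A and C cannot change the result in this model.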


Edit: If you are interested in more details, torek's answer is what you are looking for. It gives you all the technicalities about the different merge strategies, and points out where my answer is imprecise due to being written at a very high abstraction level.

cmaster - reinstate monica answered Oct 12 '22 14:10