This is a question that has been asked many times over the years. I have found a number of answers, in particular this one:
Git - how to force merge conflict and manual merge on selected file (@Dan Moulding)
This page contains a detailed guide on how to set up a merge driver that always returns failure and thus makes a manual merge possible. I have tried to adapt that solution for Windows:
I added the following to my %homepath%\.gitconfig:
[merge "verify"]
name = merge and verify driver
driver = %homepath%\\merge-and-verify-driver.bat %A %O %B
I changed the driver to
cmd /K "echo Working > merge.log & git merge-file %1% %2% %3% & exit 1"
(echo Working > merge.log was added to check whether the driver was invoked),
and, at the root of the repo, created a file .gitattributes with the following line:
*.txt merge=verify
Unfortunately, it does not work. I tried to merge a file, feature.txt, and, alas, the merge completed successfully. It seems that the driver was not invoked at all, since the merge.log file was not created.
Am I doing anything wrong? Any solution to the problem of forcing a manual merge is most welcome.
There are two parts to the problem. The relatively easy one is writing the custom merge driver, as you did in steps 1 and 2. The hard one is that Git doesn't actually bother running the custom driver if, in Git's opinion, it's not necessary. This is what you have observed in step 3.
So, when does Git run your merge driver? The answer is fairly complicated, and to get there we have to define the term merge base, which we'll get to in a moment. You also need to know that Git identifies files—in fact, pretty much everything: commits, files, patches, and so on—by their hash IDs. If you already know all of this, you can skip directly to the last section.
Hash IDs (or sometimes object IDs or OIDs) are those big ugly names you see for commits:
$ git rev-parse HEAD
7f453578c70960158569e63d90374eee06104adc
$ git log
commit 7f453578c70960158569e63d90374eee06104adc
Author: ...
Everything Git stores has a unique hash ID, computed from the contents of the object (the file or commit or whatever).
If you store the same file twice (or more), you get the same hash ID twice (or more). Since each commit ultimately stores a snapshot of every file as of the time of that commit, each commit therefore has a copy of every file, listed by its hash ID. You can in fact view these:
$ git ls-tree HEAD
100644 blob b22d69ec6378de44eacb9be8b61fdc59c4651453 README
100644 blob b92abd58c398714eb74cbe66671c7c3d5c030e2e integer.txt
100644 blob 27dfc5306fbd27883ca227f08f06ee037cdcb9e2 lorem.txt
The three big ugly IDs in the middle are the three hash IDs. Those three files are in the HEAD commit under those IDs. I have the same three files in several more commits, usually with slightly different contents.
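A quick way to see the "same contents, same hash ID" rule for yourself is git hash-object, which computes the ID Git would give to some content. This is only a small sketch: the sample strings and the variable names are made up for illustration.

```shell
# `git hash-object --stdin` only computes the object ID for the given
# content; it stores nothing and does not need a repository.
id_first=$(printf 'hello\n' | git hash-object --stdin)
id_again=$(printf 'hello\n' | git hash-object --stdin)
id_other=$(printf 'hello!\n' | git hash-object --stdin)

echo "first: $id_first"
echo "again: $id_again"   # identical content, so identical ID
echo "other: $id_other"   # one extra character, so a different ID
```

The first two IDs match because the content is byte-for-byte identical; the third differs because the content does.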
The DAG, or Directed Acyclic Graph, is a way of drawing the relationships between commits. To really use Git properly, you need at least a vague idea of what the DAG is. It's also called the commit graph, which is a nicer term in some ways since it avoids specialized informatics jargon.
In Git, when we make branches, we can draw them in any number of ways. The method I like to use here (in text, on StackOverflow) is to put earlier commits on the left and later commits on the right, and to label each commit with a single uppercase letter. Ideally, we'd draw these the way Git keeps them, which is rather backwards:
A <- B <- C <-- master
Here we have just three commits, all on master. The branch name master "points to" the last of the three commits. This is how Git actually finds commit C, by reading its hash ID from the branch name master, and in fact the name master effectively stores just this one ID.
Git finds commit B by reading commit C. Commit C has, inside it, the hash ID of commit B. We say that C "points to" B, hence the backwards-pointing arrow. Likewise, B "points to" A. Since A is the very first commit, it has no previous commit so it has no back-pointer.
These internal arrows tell Git about the parent commit of each commit. Most of the time, we don't care that they are all backwards, so we can draw this more simply as:
A--B--C <-- master
which lets us pretend that it's obvious that C comes after B, even though working that out is in fact quite hard in Git. (Compare with the claim "B comes before C", which is very easy in Git: It's easy to go backwards, because the internal arrows are all backwards.)
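To watch those backwards arrows in action, here is a rough sketch that builds three empty commits in a throwaway repository and follows the parent links. The temporary directory, the "Example" identity and the variable names are all assumptions made up for this illustration.

```shell
# Throwaway repository with three empty commits A, B, C (names are made up).
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" config user.name "Example"
git -C "$repo" config user.email "example@example.invalid"
for msg in A B C; do
    git -C "$repo" commit -q --allow-empty -m "$msg"
done

commit_c=$(git -C "$repo" rev-parse HEAD)     # the ID the branch name stores
commit_b=$(git -C "$repo" rev-parse HEAD^)    # C's parent, read out of C
commit_a=$(git -C "$repo" rev-parse HEAD^^)   # B's parent, read out of B

# One line per commit: subject, abbreviated ID, abbreviated parent ID(s).
git -C "$repo" log --format='%s: %h <- parent %p'

# The root commit A has no parent, so asking for one fails.
if git -C "$repo" rev-parse -q --verify "$commit_a^" >/dev/null 2>&1; then
    root_has_parent=yes
else
    root_has_parent=no
fi
echo "root has parent: $root_has_parent"
```

Note that there is no equally cheap command for going the other way (from B to its children); Git has to walk backwards from the branch tips to find them.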
Now let's draw an actual branch. Suppose we make a new branch, starting at commit B, and make a fourth commit D (it's not clear exactly when we make it but in the end it doesn't matter anyway):
A--B--C <-- master
    \
     D <-- sidebr
The name sidebr now points to commit D, while the name master points to commit C.
One key Git concept here is that commit B is on both branches. It's on master and sidebr. This is true for commit A as well. In Git, any given commit can be, and often is, on many branches simultaneously.
There's another key concept hidden in Git here that is quite different from most other version control systems, which I will just mention in passing. This is that the actual branch is actually formed by the commits themselves, and that the branch names have almost no meaning or contribution here. The names merely serve to find the branch tips: commits C and D in this case. The branch itself is what we get by drawing the connecting lines, going from newer (child) commits back to older (parent) commits.
It's also worth noting, as a side point, that this weird backwards linkage allows Git to never, ever change anything about any commit. Note that both C and D are children of B, but we didn't necessarily know, back when we made B, that we were going to make both C and D. But, because the parent doesn't "know" its children, Git did not have to store the IDs of C and D inside B at all. It just stores the ID of B (which definitely did exist by then) inside each of C and D when it creates each of C and D.
These drawings that we make show (part of) the commit graph.
A proper definition of merge bases is too long to go into here, but now that we've drawn the graph, an informal definition is very easy, and visually obvious. The merge base of two branches is the point at which they first come together, when we work backwards as Git does. That is, it's the first such commit that's on both branches.
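Git will compute the merge base for you with git merge-base. The following is a sketch only, rebuilding the small A-B-C / D graph in a throwaway repository; the temporary directory, the "Example" identity and the variable names are assumptions for illustration.

```shell
# Rebuild the small graph: A--B--C on master, D on sidebr branching off B.
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" symbolic-ref HEAD refs/heads/master   # name the first branch master
git -C "$repo" config user.name "Example"
git -C "$repo" config user.email "example@example.invalid"

git -C "$repo" commit -q --allow-empty -m A
git -C "$repo" commit -q --allow-empty -m B
commit_b=$(git -C "$repo" rev-parse HEAD)
git -C "$repo" branch sidebr                         # sidebr starts at B
git -C "$repo" commit -q --allow-empty -m C          # C goes onto master
git -C "$repo" checkout -q sidebr
git -C "$repo" commit -q --allow-empty -m D          # D goes onto sidebr

# Ask Git where the two branches first come together.
merge_base=$(git -C "$repo" merge-base master sidebr)
echo "merge base: $(git -C "$repo" log -1 --format=%s "$merge_base")"
```

The reported commit is B, the first commit reachable from both branch tips.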
Thus, in:
A--B--C <-- master
    \
     D <-- sidebr
the merge base is commit B. If we make more commits:
A--B--C--F <-- master
    \
     D--E--G <-- sidebr
the merge base remains commit B. If we actually make a successful merge, the new merge commit has two parent commits instead of just one:
A--B--C--F---H <-- master
    \       /
     D--E--G <-- sidebr
Here, commit H is the merge, which we made on master by running git merge sidebr, and its two parents are F (the commit that used to be the tip of master) and G (the commit that still is the tip of sidebr).
If we now continue making commits, and later decide to do another merge, G will be the new merge base:
A--B--C--F---H--I <-- master
    \       /
     D--E--G--J <-- sidebr
H has two parents, and we (and Git) follow both parents "simultaneously" when we look backwards. Hence, commit G is the first one that's on both branches, if and when we run another merge.
Note that F is not, in this case, on sidebr: we have to follow the parent links as we encounter them, so J leads back to G which leads back to E, etc., so that we never get to F when starting from sidebr. If, however, we make our next merge from master into sidebr:
A--B--C--F---H--I <-- master
    \       /    \
     D--E--G--J---K <-- sidebr
Now commit F is on both branches. But in fact, commit I is also on both branches, so even though this makes merges going both ways, we're OK here. We can get in trouble with so-called "criss-cross merges", and I will draw one just to illustrate the problem, but not go into it here:
A--B--C--E-G--I <-- br1
    \     X
     D---F-H--J <-- br2
We get this by starting with the two branches going out to E and F respectively, then doing git checkout br1; git merge br2; git checkout br2; git merge br1 to make G (a merge of E and F, added to br1) and then immediately also make H (a merge of F and E, added to br2). We can continue committing to both branches, but eventually, when we go to merge again, we have a problem picking a merge base, because both E and F are "best candidates".
Usually, even this "just works", but sometimes criss-cross merges create issues that Git tries to handle in a fancy way using its default "recursive" merge strategy. In these (rare) cases you can see some weird-looking merge conflicts, especially if you set merge.conflictstyle = diff3 (which I normally recommend: it shows you the merge base version in conflicted merges).
Now that we have defined the merge base and seen the way hashes identify objects (including files), we can now answer the original question.
When you run git merge branch-name, Git:
1. Identifies the current commit, HEAD. This is also called the local or --ours commit.
2. Identifies the other commit, the one you named via branch-name. That's the tip commit of the other branch, and is variously called the other, --theirs, or sometimes remote commit ("remote" is a very poor name since Git uses that term for other purposes too).
3. Identifies the merge base. Let's call this commit "base". The letter B is also good but with a merge driver, %A and %B refer to the --ours and --theirs versions respectively, with %O referring to the base.
4. Effectively runs two separate git diff commands: git diff base ours and git diff base theirs.
These two diffs tell Git "what happened". Git's goal, remember, is to combine two sets of changes: "what we did in ours" and "what they did in theirs". That's what the two git diffs show: "base vs ours" is what we did, and "base vs theirs" is what they did. (This is also how Git discovers if any files were added, deleted, and/or renamed, in base-to-ours and/or base-to-theirs, but this is an unnecessary complication right now, which we will ignore.)
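You can reproduce those two diffs by hand before merging. This is a sketch under made-up assumptions: a throwaway repository, a file f.txt with three lines, a branch called sidebr, and an "Example" identity, none of which come from the question.

```shell
# Throwaway repository where each side changes a different line of f.txt.
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" symbolic-ref HEAD refs/heads/master
git -C "$repo" config user.name "Example"
git -C "$repo" config user.email "example@example.invalid"

printf 'one\ntwo\nthree\n' > "$repo/f.txt"
git -C "$repo" add f.txt
git -C "$repo" commit -q -m "base"

git -C "$repo" checkout -q -b sidebr
printf 'one\ntwo\nTHREE\n' > "$repo/f.txt"       # what "they" did
git -C "$repo" commit -q -am "theirs"

git -C "$repo" checkout -q master
printf 'ONE\ntwo\nthree\n' > "$repo/f.txt"       # what "we" did
git -C "$repo" commit -q -am "ours"

# The same three commits git merge sidebr would look at.
base=$(git -C "$repo" merge-base HEAD sidebr)
ours_diff=$(git -C "$repo" diff "$base" HEAD)
theirs_diff=$(git -C "$repo" diff "$base" sidebr)

echo "--- base vs ours ---";   echo "$ours_diff"
echo "--- base vs theirs ---"; echo "$theirs_diff"
```

Here "base vs ours" shows only the change to the first line and "base vs theirs" only the change to the last line; combining the two is the job of the merge.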
It's the actual mechanics of combining these changes that invokes merge drivers, or—as in our problem cases—doesn't.
Remember that Git has every object catalogued by its hash ID. Each ID is unique based on the object's contents. This means it can instantly tell whether any two files are 100% identical: they are exactly the same if and only if they have the same hash.
This means that if, in base-vs-ours or base-vs-theirs, the two files have the same hashes, then either we made no changes, or they made no changes. If we made no changes and they made changes, why then, obviously the result of combining these changes is their file. Or, if they made no changes and we made changes, the result is our file.
Similarly, if ours and theirs have the same hash, then we both made the same changes. In this case, the result of combining the changes is either file—they're the same, so it won't even matter which one Git picks.
Hence, for all of these cases, Git simply picks whichever new file has a different hash (if any) from the base version. That's the merge result, and there is no merge conflict, and Git is done merging that file. It never runs your merge driver because there is clearly no need.
Only if all three files have three different hashes does Git have to do a real three-way merge. This is when it will run your custom merge driver, if you have defined one.
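To check this behaviour yourself, the sketch below sets up an always-failing merge driver and runs two merges: one where only the other branch touched the file, and one where both sides did. Everything concrete here is an assumption for illustration: the temporary directory, the fail-driver.sh script, the log file, the file contents and the "Example" identity. It uses a POSIX shell rather than a Windows batch file.

```shell
work=$(mktemp -d)
repo="$work/repo"
log="$work/driver.log"

# Hypothetical driver: note that it ran, then always report a conflict.
cat > "$work/fail-driver.sh" <<EOF
#!/bin/sh
echo "driver ran for \$1" >> "$log"
exit 1
EOF
chmod +x "$work/fail-driver.sh"

git init -q "$repo"
git -C "$repo" symbolic-ref HEAD refs/heads/master
git -C "$repo" config user.name "Example"
git -C "$repo" config user.email "example@example.invalid"
git -C "$repo" config merge.verify.name "merge and verify driver"
git -C "$repo" config merge.verify.driver "'$work/fail-driver.sh' %A %O %B"

echo '*.txt merge=verify' > "$repo/.gitattributes"
printf 'one\ntwo\nthree\n' > "$repo/f.txt"
echo 'notes' > "$repo/README"
git -C "$repo" add .
git -C "$repo" commit -q -m "base"

# Case 1: only sidebr changes f.txt; master changes an unrelated file.
git -C "$repo" checkout -q -b sidebr
printf 'one\ntwo\nTHREE\n' > "$repo/f.txt"
git -C "$repo" commit -q -am "theirs: change f.txt"
git -C "$repo" checkout -q master
echo 'more notes' >> "$repo/README"
git -C "$repo" commit -q -am "ours: change README only"
if git -C "$repo" merge -m "merge 1" sidebr >/dev/null 2>&1; then
    first_merge=clean; else first_merge=conflict; fi
if [ -s "$log" ]; then driver_ran_first=yes; else driver_ran_first=no; fi

# Case 2: both branches now change f.txt, so base, ours and theirs all differ.
printf 'ONE\ntwo\nTHREE\n' > "$repo/f.txt"
git -C "$repo" commit -q -am "ours: change f.txt"
git -C "$repo" checkout -q sidebr
printf 'one\nTWO\nTHREE\n' > "$repo/f.txt"
git -C "$repo" commit -q -am "theirs: change f.txt again"
git -C "$repo" checkout -q master
if git -C "$repo" merge -m "merge 2" sidebr >/dev/null 2>&1; then
    second_merge=clean; else second_merge=conflict; fi
if [ -s "$log" ]; then driver_ran_second=yes; else driver_ran_second=no; fi

echo "merge 1: $first_merge, driver ran: $driver_ran_first"
echo "merge 2: $second_merge, driver ran: $driver_ran_second"
```

In the first merge only one side's hash differs from the base, so Git takes that version and never calls the driver; in the second, all three hashes differ and the driver is run and forces the conflict.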
There is a way around this, but it is not for the faint of heart. Git offers not just custom merge drivers, but also custom merge strategies. There are four built-in merge strategies, all selected via the -s option: -s ours, -s recursive, -s resolve, and -s octopus. You can, however, use -s custom-strategy to invoke your own.
The problem is that to write a merge strategy, you must identify the merge base(s), do any recursive merging you want (a la -s recursive) in the case of ambiguous merge bases, run the two git diffs, figure out file add/delete/rename operations, and then run your various drivers. Because this takes over the whole megillah, you can do whatever you want, but you must do quite a lot. As far as I know there is no canned solution using this technique.
tl;dr: I tried to repeat what you described, and it seems to work. I made two changes compared to your version; without them it did not work for me either (because the driver basically failed to run).
I have tried this:
Create a merge driver, $HOME/bin/errorout.bat:
exit 1
Create a section for the merge type
[merge "errorout"]
name = errorout
driver = ~/bin/errorout.bat %A %O %B
Create the .gitattributes file:
*.txt merge=errorout
After that, the error is reported the way I think you want it to be:
$ git merge a
C:\...>exit 1
Auto-merging f.txt
CONFLICT (content): Merge conflict in f.txt
Automatic merge failed; fix conflicts and then commit the result.
I have git version 2.11.0.rc1.windows.1. I was not able to make the more complicated command you specified run successfully; it reported some syntax errors.