I am somewhat new to git: I've been using it for a number of months, and I'm comfortable doing most of the basic tasks. So I think it's time to take on some more complicated ones. At my work, we have a few people updating older code; this involves both actual code changes and restructuring the directories to be more modular. My question is: can these two things be done in parallel branches and then merged or rebased? My intuition says no, because a directory restructure is a rename, and git renames by adding a new file and deleting the old one (at least, that is how I understand it). But I wanted to be sure.
Here's the scenario: parent-branch looks like:
├── a.txt
├── b.txt
├── c.txt
Then we create two branches, say branchA and branchB. In branchB we modify the structure:
├── lib
│   ├── a.txt
│   └── b.txt
└── test
    └── c.txt
Then in branchA we update a, b, and c.
Is there some way to merge the changes done in branchA with the new structure in branchB? Rebase comes to mind; however, I don't think lib/a.txt is actually connected to a.txt after a git mv...
Jameson
Git merge will combine multiple sequences of commits into one unified history. In the most frequent use cases, git merge is used to combine two branches.
First, a short note: you can always try a merge, then back it out, to see what it does:
$ git checkout master
Switched to branch 'master'
$ git status
(make sure it's clean—backing out of a failed merge when there's changes is not fun)
$ git merge feature
If the merge fails:
$ git merge --abort
If the automatic merge succeeds, but you don't want to keep it just yet:
$ git reset --hard HEAD^
(Remember that HEAD^ is the first parent of the current commit, and the first parent of a merge is "what was there before the merge". Thus, if the merge worked, HEAD^ is the commit just before the merge.)
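The try-then-back-out sequence above can be sketched as one script. This is only an illustration in a throwaway repository: the file names, the commit messages, and the demo identity are assumptions, not part of the original recipe.

```shell
#!/bin/sh
# Sketch: try a merge in a scratch repository, then undo it.
# All file names and commit messages here are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # call the first branch "master" on any git version
git config user.name "Demo"
git config user.email "demo@example.com"

echo base > file.txt
git add file.txt
git commit -q -m "base"

git checkout -q -b feature
echo feature > feature.txt
git add feature.txt
git commit -q -m "feature work"

git checkout -q master
echo master > master.txt
git add master.txt
git commit -q -m "master work"

before=$(git rev-parse HEAD)              # tip of master before the trial merge

if git merge -q --no-edit feature; then
    merged=$(git rev-parse HEAD)          # the new merge commit
    first_parent=$(git rev-parse HEAD^)   # first parent: what was there before
    git reset -q --hard HEAD^             # back out the successful merge
else
    git merge --abort                     # back out the failed merge
fi

after=$(git rev-parse HEAD)
echo "before=$before"
echo "after=$after"
```

In a real repository only the final if/else part applies, and git status should report a clean tree before you start.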
Here's a simple recipe for finding out what renames git merge will automatically detect.
Make sure diff.renamelimit [1] is 0 and diff.renames is true:
$ git config --get diff.renamelimit
0
$ git config --get diff.renames
true
If these are not already set this way, set them. (This affects the diff step below.)
Choose which branch you're merging-into, and which you're merging-from. That is, you are going to do something like git checkout master; git merge feature soon; we need to know the two names here. Find the merge base between them:
$ into=master from=feature
$ base=$(git merge-base $into $from); echo $base
You should see some 40-character SHA-1, like ae47361... or whatever here. (Feel free to type out master and feature instead of $into and $from everywhere here. I am using the variables so that this is a "recipe" instead of an "example".)
Compare the merge base against both $into and $from to see which files are detected as "renames":
$ git diff --name-status $base $into
R100    fileB   fileB.renamed
$ git diff --name-status $base $from
R100    fileC   fileD
(You might want to run these diffs with the output saved to two files, and then peruse the files later. Side note: you can get the effect of the second diff with special syntax, master...feature: the three dots here mean "find the merge base".)
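The recipe so far can be wrapped in a small helper. This is a sketch, not the answer's own script: show_renames is a hypothetical function name, the scratch repository merely mirrors the fileB/fileC example, and -M is passed explicitly so the result does not depend on the diff.renames setting.

```shell
#!/bin/sh
# Sketch of the recipe as a helper. show_renames is a hypothetical name.
set -e

show_renames() {
    into=$1
    from=$2
    base=$(git merge-base "$into" "$from")
    echo "merge base: $base"
    echo "== changes already in $into =="
    git diff --name-status -M "$base" "$into"   # -M turns rename detection on
    echo "== changes coming from $from =="
    git diff --name-status -M "$base" "$from"
}

# Scratch repository mirroring the example above (hypothetical contents).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"
echo "contents of file B" > fileB
echo "contents of file C" > fileC
git add fileB fileC
git commit -q -m "base"

git checkout -q -b feature
git mv fileC fileD
git commit -q -m "rename fileC on feature"

git checkout -q master
git mv fileB fileB.renamed
git commit -q -m "rename fileB on master"

out=$(show_renames master feature)
echo "$out"
```

In an existing repository, define the function and call show_renames with your own two branch names; the second list should match what git diff --name-status -M master...feature prints.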
The two output sections have a list of files Added, Deleted, Modified, Renamed, and so on, each marked by its initial letter A, D, M, or R (this example has just the two renames, with 100% matches).
Since $into is master, the first list is what git thinks has already happened in master. (These are the changes git "wants to keep", when you merge-in feature.)
Meanwhile, $from is feature, so the second list is what git thinks happened in feature. (These are the changes git wants to "now add to master", when you do the merge.)
At this point, you have to do a bunch of work:
- Anything marked R, git will detect as renamed.
- If the R lists are the same in both branches, you may be all good (but read on anyway). If there are Rs in the first list that are not in the second ... well, see below.
- When you run git checkout master; git merge feature (or git checkout $into; git merge $from) git will do the renames shown in the second list, in order to "add those changes" to master.
- Look for D and A entries that you wanted to have show up as R entries: these occur when, in one of the branches, you not only renamed the file, but also changed the contents so much that git no longer detects the rename.
If the second list does not show everything you want to see, you're going to have to help git out. See the even longer description below.
If the first list has a rename that's not in the second, this may be entirely harmless, or it may cause an "unnecessary" merge conflict and a missed chance for a real merge. Git is going to assume that you intend to keep this rename, and also look at what happened in the merge-from branch ($from, or feature in this case). If the original file was modified there, git will attempt to bring the changes from there into the renamed file. That is probably what you want. If the original file was not modified there, git has nothing to bring in and will leave the file alone. That's also probably what you want. The "bad" case is, again, an undetected rename: git thinks the original file was deleted in branch feature, and a new file with some other name was created.
In this "bad" case, git will give you a merge conflict. For instance, it might say:
CONFLICT (rename/delete): newname deleted in feature and renamed in HEAD. Version HEAD of newname left in tree.
Automatic merge failed; fix conflicts and then commit the result.
The problem here is not that git has retained the file under its new name in master (we probably want that); it's that git may have missed the chance to merge the changes made in branch feature.
Worse (and this might be classifiable as a bug), if the new name occurs in the merge-from branch feature, but git thinks it's a new file there, git leaves us with only the merge-into version of the file in the work tree. The message emitted is the same. Here, I made a few more changes in master to rename fileB to fileE, and on feature, made sure that git would not detect the change as a rename:
$ git diff --name-status $base master
R100    fileB   fileE
$ git diff --name-status $base feature
D       fileB
R100    fileC   fileD
A       fileE
$ git checkout master; git merge feature
CONFLICT (rename/delete): fileE deleted in feature and renamed in HEAD. Version HEAD of fileE left in tree.
Automatic merge failed; fix conflicts and then commit the result.
Note the potentially misleading message, fileE deleted in feature. Git is printing the new name (the master version of the name); that's the name it believes you "want" to see. But it is file fileB that was "deleted" in feature, replaced by an entirely new fileE.
(git-imerge, mentioned below, may be able to handle this particular case.)
[1] There's also a merge.renameLimit (spelled with lowercase limit in the source, but these configuration variables are case-insensitive) that you can set separately. Setting these to 0 tells git to use "a suitable default", which has changed over the years as CPUs have gotten faster. If a separate merge rename limit is not set, git uses the diff rename limit, and again a suitable default if that's not set or is 0. If you set them differently, merge and diff will detect renames in different cases, though.
You can also now set the "rename threshold" in a recursive merge with -Xrename-threshold=, e.g., -Xrename-threshold=50%. The usage here is the same as for git diff's -M option. This option first appeared in git 1.7.4.
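Since the threshold works the same way as git diff's -M option, one way to pick a value is to preview it with git diff first. The sketch below uses a scratch repository with made-up file names and contents: a file is renamed and four of its ten lines are rewritten, so a strict threshold reports a delete plus an add while a looser one reports a rename.

```shell
#!/bin/sh
# Sketch: preview how the similarity threshold changes rename detection.
# File names and contents are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"

# Ten distinct lines in the original file.
for i in 1 2 3 4 5 6 7 8 9 10; do
    echo "original line number $i"
done > old.txt
git add old.txt
git commit -q -m "base"

# Rename the file, then rewrite its last four lines.
git mv old.txt new.txt
{
    head -n 6 new.txt
    for i in 7 8 9 10; do
        echo "rewritten text, row $i"
    done
} > new.tmp
mv new.tmp new.txt
git add new.txt
git commit -q -m "rename and edit"

strict=$(git diff --name-status -M90% HEAD^ HEAD)   # too strict: delete + add
loose=$(git diff --name-status -M50% HEAD^ HEAD)    # loose enough: rename

echo "with -M90%:"; echo "$strict"
echo "with -M50%:"; echo "$loose"

# The merge-time counterpart described above would be, for example:
#   git merge -Xrename-threshold=50% feature
```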
Let's say you are on branch master, and you do git merge 1234567 or git merge otherbranch. Here's what git does:
Find the merge-base: git merge-base master 1234567 or git merge-base master otherbranch.
This yields a commit-ID. Let's call that ID B, for "Base". Git now has three specific commit IDs: B, the merge base; the commit ID of the tip of the current branch master; and the commit ID you gave it, 1234567 or the tip of branch otherbranch. Let's just draw these in terms of the commit graph, for completeness; let's say it looks like this:
A - B - C - D - E         <-- master
     \
      F - G - H - I       <-- otherbranch
If all goes well, git will produce a merge commit that has E and I as its two parents, but we want to concentrate here on the resulting work tree rather than the commit graph.
Given these three commits (B, E, and I), git computes two diffs, a la git diff:
git diff B E
git diff B I
The first is the set of changes made on master, and the second is the set of changes made on otherbranch, in this case.
If you run git diff manually, you can set the "similarity threshold" for rename detection with -M (see above for setting it during merge). Git's default merge sets automatic rename detection to 50%, which is what you get with no -M option and diff.renames set to true.
If the files are "sufficiently similar" (and "exactly the same" is always sufficient), git will detect renames:
$ git diff B otherbranch   # I tagged the merge-base `B`
diff --git a/fileB b/fileB.txt
similarity index 71%
rename from fileB
rename to fileB.txt
index cfe0655..478b6c5 100644
--- a/fileB
+++ b/fileB.txt
@@ -1,3 +1,4 @@
 file B contains
 several lines
 of stuff.
+changeandrename
(In this case I just renamed from fileB to fileB.txt but the detection works across directories too.) Let's note that this is conveniently represented by git diff --name-status output:
$ git diff --name-status B otherbranch
R071    fileB   fileB.txt
(I should also note here that I have diff.renames set to true and diff.renamelimit = 0 in my global git config.)
Finally, git merges the changes from B to I (on otherbranch) into the changes from B to E (on master).
If git is able to detect that lib/a.txt is renamed from a.txt, it will connect them. (And you can preview whether it will by doing a git diff.) In this case the automatic merge result is likely to be what you want, or sufficiently close.
If not, though, it won't.
When the automatic rename detection fails, there's a way to break up commits (or maybe they are already sufficiently broken-up) step-wise. For instance, suppose in the sequence of F, G, H, I commits, one step (maybe G) simply renames a.txt to lib/a.txt, and other steps (F, H, and/or I) make so many other changes to a.txt (under whatever name) that they fool git into not realizing that the file was renamed. What you can do here is increase the number of merges, so that git can "see" the rename. Let's assume for simplicity that F does not change a.txt and G renames it, so that the diff from B to G shows the rename. What we can do is first merge commit G:
git checkout master; git merge otherbranch~2
Once this merge is complete and git has renamed from a.txt to lib/a.txt in the tree for the new merge commit on branch master, we do a second merge to bring in commits H and I:
git merge otherbranch
This two-step merge causes git to "do the right thing".
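The difference between the one-step and two-step merge can be reproduced in a scratch repository. The sketch below makes several assumptions: the file contents are invented, F and I are unrelated commits, G is a pure rename, and H rewrites nine of the twelve lines so that a direct comparison of B and I falls below the default 50% similarity.

```shell
#!/bin/sh
# Sketch: one-step merge misses the rename, two-step merge follows it.
# File contents and commit messages are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

# B: the merge base.
for i in 1 2 3 4 5 6 7 8 9 10 11 12; do
    echo "original line number $i"
done > a.txt
git add a.txt
git commit -q -m "B"

# otherbranch: F (unrelated), G (rename only), H (heavy rewrite), I (unrelated).
git checkout -q -b otherbranch
echo f > f.txt; git add f.txt; git commit -q -m "F"
mkdir lib
git mv a.txt lib/a.txt
git commit -q -m "G: rename only"
{
    head -n 3 lib/a.txt
    for i in 4 5 6 7 8 9 10 11 12; do
        echo "rewritten text, row $i"
    done
} > a.new
mv a.new lib/a.txt
git commit -q -am "H: rewrite most of lib/a.txt"
echo i > i.txt; git add i.txt; git commit -q -m "I"

# master: a small edit to a.txt under its old name.
git checkout -q master
sed '1s/.*/line 1 edited on master/' a.txt > a.new
mv a.new a.txt
git commit -q -am "E: edit a.txt"

# One step: the rename is not detected, so the merge conflicts.
if git merge -q --no-edit otherbranch >/dev/null 2>&1; then
    one_step=clean
else
    one_step=conflict
    git merge --abort
fi
echo "one-step merge: $one_step"

# Two steps: merge G first (otherbranch~2), then the rest.
git merge -q --no-edit otherbranch~2 >/dev/null
git merge -q --no-edit otherbranch >/dev/null
echo "two-step merge: first line of lib/a.txt is '$(head -n 1 lib/a.txt)'"
```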
In the most extreme case, an incremental, commit-by-commit merge sequence (which would be extremely painful to do manually) will pick up everything that could be picked up. Fortunately someone has already written this "incremental merge" program for you: git-imerge. I have not tried this but it's the Obvious Answer for hard cases.