git version 2.11.0.windows.1
Here is a bash snippet to reproduce my test repository:
git init
# Create a file
echo Hello > a.txt
git add a.txt
git commit -m 'First commit'
# Change it on one branch
git checkout -b feature
echo Hi > a.txt
git commit -am 'Change'
# Rename it on the other
git checkout master
git mv a.txt b.txt
git commit -m 'Move'
# Merge both changes
git merge --no-edit feature
At the end, git log --graph --pretty=oneline --abbrev-commit prints:
* 06b5bb7 Merge branch 'feature'
|\
| * 07ccfb6 Change
* | 448ad99 Move
|/
* 31eae74 First commit
Now, I want to get the full log for b.txt (ex-a.txt).
git log --graph --pretty=oneline --abbrev-commit --follow -- b.txt
prints:
...
* | 1a07e48 Move
|/
* 5ff73f6 First commit
As you can see, the Change commit is not listed, even though it did modify the file.
I think I have tracked it down to the implicit use of --topo-order by --graph, since adding --date-order brings the commit back, but that might be chance.
Additionally, adding -m shows the merge commit (which is fine) and the Change commit, but then the merge commit is duplicated:
* 36c80a8 (from 1a07e48) Merge branch 'feature'
|\
| | 36c80a8 (from 05116f1) Merge branch 'feature'
| * 05116f1 Change
* | 1a07e48 Move
|/
* 5ff73f6 First commit
What am I missing to explain the weird behaviour I'm witnessing?
How can I display cleanly all of the commits that changed a file, following through renames?
You're being bitten by git log's cheap and sleazy implementation of --follow, plus the fact that git log often doesn't even look inside merges.
Fundamentally, --follow works internally by changing the name of the file it's looking for. It does not remember both names, so when the linearization algorithm (breadth first search via priority queue) goes down the other leg of the merge, it has the wrong name. You are correct that the order of commit visits matters, since it's when Git deduces a rename that Git changes the name of the file it's searching for.
In this graph (it looks like you ran the script several times because the hashes changed—the hashes here are from the first sample):
* 06b5bb7 Merge branch 'feature'
|\
| * 07ccfb6 Change
* | 448ad99 Move
|/
* 31eae74 First commit
git log will visit commit 06b5bb7, and put 448ad99 and 07ccfb6 on the queue. With topological order (which --graph implies) it will next visit 448ad99, examine the diff, and see the rename. It is now looking for a.txt instead of b.txt. Commit 448ad99 is selected, so git log will print it to the output; and Git adds 31eae74 to the visit queue. Next, Git visits 07ccfb6, but it is now looking for a.txt so this commit is not selected. Git adds 31eae74 to the visit queue (but it's already there so this is a no-op). Finally, Git visits 31eae74; comparing that commit's tree to the empty tree, Git finds an added a.txt so this commit gets selected.
Note that had Git visited 07ccfb6 before 448ad99, it would have selected both, because at the start it is looking for b.txt.
The -m flag works by "splitting" a merge into two separate internal "virtual commits" (with the same tree, but with the (from ...) added to their "names" so as to be able to tell which virtual commit resulted from which parent). This has the side effect of retaining both of the split merges and looking at their diffs (since the result of splitting this merge is two ordinary non-merge commits). So now (note that this uses your new repository, with the different hashes from the second sample) Git visits commit 36c80a8 (from 1a07e48), diffs 1a07e48 vs 36c80a8, sees a change to b.txt and selects the commit, and puts 1a07e48 on the visit queue. Next, it visits commit 36c80a8 (from 05116f1), diffs 05116f1 vs 36c80a8, and puts 05116f1 on the visit queue. The rest is fairly obvious from here.
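To see the two per-parent diffs that the split merge boils down to, you can diff the merge against each parent yourself. The following is only a sketch: it rebuilds the question's repository in a throwaway directory (the temporary path and the committer identity are placeholders I made up), and it uses plain git diff with -M rather than anything git log does internally.

```shell
#!/bin/sh
# Rebuild the question's repository in a scratch directory.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # the script expects "master"
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"     # placeholder identity

echo Hello > a.txt
git add a.txt
git commit -q -m 'First commit'
git checkout -q -b feature
echo Hi > a.txt
git commit -q -am 'Change'
git checkout -q master
git mv a.txt b.txt
git commit -q -m 'Move'
git merge -q --no-edit feature

# One diff per parent of the merge, with rename detection (-M) enabled.
from_move=$(git diff -M --name-status 'HEAD^1' HEAD)
from_change=$(git diff -M --name-status 'HEAD^2' HEAD)
echo "vs first parent (Move):    $from_move"
echo "vs second parent (Change): $from_change"
```

Against the first parent the merge shows up as a modification of b.txt; against the second parent it shows up as a rename from a.txt to b.txt. Those are the two views that the (from ...) entries stand for.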
How can I display cleanly all of the commits that changed a file, following through renames?
The answer for Git is that you can't, at least not using what is built in to Git.
You can (sometimes) get a little closer by adding --cc or -c to your git log command. This makes git log look inside merge commits, doing what Git calls a combined diff. But this doesn't necessarily work anyway, because, hidden away in a different part of the documentation is this key sentence:
Note that combined diff lists only files which were modified from all parents.
Here is what I get with --cc added (note, the ... is literally there, in git log's output):
$ git log --graph --oneline --follow --cc -- b.txt
* e5a17d7 (HEAD -> master) Merge branch 'feature'
|\
| |
...
* | 52e75c9 Move
|/
| diff --git a/a.txt b/b.txt
| similarity index 100%
| rename from a.txt
| rename to b.txt
* 7590cfd First commit
diff --git a/a.txt b/a.txt
new file mode 100644
index 0000000..e965047
--- /dev/null
+++ b/a.txt
@@ -0,0 +1 @@
+Hello
Fundamentally, though, you'd need git log to be much more aware of file renames at merge commits, and to have it look for the old name down any leg using the old file name, and the new name down any leg using the new name. This would require that git log use (most of) the -m option internally on each merge (i.e., split each merge into N separate diffs, one per parent, so as to find which legs have what renames) and then keep a list of which name to use down which branches of merges. But when the forks come back together, i.e., when the multiple legs of the merge (which becomes a fork in our reverse direction) rejoin, it's not clear which name is the correct name to use!
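If you already know every name the file has had, one manual workaround is to drop --follow and list all of the names as pathspecs. This does not follow renames automatically, so it does not change the conclusion above; it is only a sketch that assumes the two names a.txt and b.txt from the question, run against a scratch copy of that repository (temporary path and committer identity are placeholders).

```shell
#!/bin/sh
# Rebuild the question's repository in a scratch directory.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # the script expects "master"
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"     # placeholder identity

echo Hello > a.txt
git add a.txt
git commit -q -m 'First commit'
git checkout -q -b feature
echo Hi > a.txt
git commit -q -am 'Change'
git checkout -q master
git mv a.txt b.txt
git commit -q -m 'Move'
git merge -q --no-edit feature

# No --follow: name both paths explicitly, so neither leg of the
# merge is searched under the wrong file name.
git log --graph --oneline -- a.txt b.txt

# The same selection, subjects only, captured for inspection.
subjects=$(git log --pretty=format:%s -- a.txt b.txt)
```

With both names given, the Change commit is selected alongside the merge, Move, and First commit. The obvious cost is that you have to discover the old names yourself, and any unrelated file that ever lived at one of those paths would be included too.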