Is it possible to show the similarity index of two files in a Git repository using git diff? According to the man pages, git diff -p may produce patches with this information in certain cases, but the following command, for example, does not output the similarity index:
git diff -p --no-index a b
Here a and b are two files known to the repository. Is it possible to have Git calculate and report the similarity index between two existing files in a repository?
Unfortunately, no—or more precisely, not with any existing front-end command. The only way to get Git to compute a similarity index for two files is to create two tree objects in which it seems possible, to Git, that the file was renamed.
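We can, however, do just that. Here's the method, as implemented by the script further down:
1. Point Git at a temporary index file, so that the real index is left untouched.
2. Add just the first file to that index and write the index out as a tree object.
3. Empty the temporary index, then do the same for the second file.
4. Run git diff-tree on the two trees with --find-renames=01. (Using a rename threshold of 00 does not work: this just disables rename detection.)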
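To make the idea concrete, it can be tried by hand in a throwaway repository. This is only a rough sketch: the temporary directory, the file names a and b, and their contents are made up for illustration, and the two files are deliberately identical so that the result is easy to predict.

```shell
#!/bin/sh
# Sketch only: directory, file names and contents are invented for this demo.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

printf 'one\ntwo\nthree\n' > a
cp a b                                # same content, different name

# Use a private index so the repository's real index is left alone.
GIT_INDEX_FILE="$repo/tmp-index"
export GIT_INDEX_FILE

git add a
t1=$(git write-tree)                  # tree containing only "a"

rm -f "$GIT_INDEX_FILE"               # start again from an empty index
git add b
t2=$(git write-tree)                  # tree containing only "b"

# Ask for rename detection with the lowest usable threshold.
status=$(git diff-tree --name-status --find-renames=01 "$t1" "$t2")
echo "$status"
```

Because the two files have identical content, Git should report the pair as an exact rename, so the status field is R100. With files that differ, the number after R is the similarity index.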
I wrapped this up into a script, which appears below. Save it as an executable file named git-similarity somewhere in your $PATH (I use $HOME/scripts/ as a directory containing executable scripts that run on any architecture) and you can run git similarity a b.
(This is lightly tested.)
#! /bin/sh
#
# git-similarity: script to compute similarity of two files
. git-sh-setup # for die() etc
TAB=$(printf '\t')  # $'\t' is a bashism; this form works in any POSIX sh
# should probably use OPTIONS_SPEC, but not yet
usage()
{
    echo "usage: git similarity file1 file2"
}

case $# in
2) ;;
*) usage 1>&2; exit 1;;
esac
test -f "$1" || die "cannot find file $1, or not a regular file"
test -f "$2" || die "cannot find file $2, or not a regular file"
test "x$1" != "x$2" || die "file names $1 and $2 are identical"
TF=$(mktemp) || exit 1
# remove the temporary index on exit or on a signal
trap 'rm -f "$TF"' 0 1 2 3 15
export GIT_INDEX_FILE="$TF"
# create a tree holding (just) the argument file
maketree() {
    rm -f "$TF"    # start from an empty index each time
    git add "$1" || exit 1
    git write-tree || exit 1
}
# Use git diff-tree here for repeatability.  We expect output of
# the form Rnnn$TAB$file1$TAB$file2, but if we get two lines,
# with D and A, we'll just print 000 here.
print_similarity() {
    # -r recurses into subtrees, so files in subdirectories are compared too
    set -- $(git diff-tree -r --name-status --find-renames=01 "$1" "$2")
    case "$1" in
    R*) echo "${1#R}";;
    *) echo "000";;
    esac
}
h1=$(maketree "$1")
h2=$(maketree "$2")
print_similarity $h1 $h2