
How to control hash length in git diff header

Tags: git, git-diff

The output of git diff contains a header like so:

index f8fdb16de,78132574a..000000000

In git help diff, this is explained like this:

2. It is followed by one or more extended header lines (this example shows a merge with two parents):

       index <hash>,<hash>..<hash>
       mode <mode>,<mode>..<mode>
       new file mode <mode>
       deleted file mode <mode>,<mode>

I want to use git diff to create patches, and I want these patches to have a predictable format so they can be compared. To do this, I need a fixed length for the hash in the "index .." header.

How can I control the length of these hashes?

I tried --abbrev=7, but it seems to have no effect.

I still see my patches updated like this:

-index 52a2a35..7813257 100755
+index 52a2a357e..78132574a 100755
asked Jan 14 '19 19:01 by donquixote




2 Answers

The --abbrev option only works for "diff-raw format output and diff-tree header lines". For standard patch output, you can use git diff --full-index to get a complete, unabbreviated blob ID. From the man page:

--full-index
Instead of the first handful of characters, show the full pre- and post-image blob object names on the "index" line when generating patch format output.

This will produce output like:

diff --git a/foo b/foo
index c7bc37b70c7e29e3e4ed048c22ca3929367aa171..ab10096fde76d8c1d6172bd09d0dc4a18fb2c2fa 100644
Binary files a/foo and b/foo differ
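To check that a generated patch really has the hash width you expect, the "index" lines can be inspected with a small script. The following is only a sketch: the function name index_hash_lengths and the regular expression are illustrative assumptions, not part of Git, and the pattern assumes plain lowercase hex object names as in the examples above.

```python
import re

# Matches the "index" extended header line of a patch, for example
#   index 52a2a35..7813257 100755
#   index f8fdb16de,78132574a..000000000   (combined diff for a merge)
# Hypothetical helper pattern; assumes lowercase hex object names.
INDEX_LINE = re.compile(
    r"^index ([0-9a-f]+(?:,[0-9a-f]+)*)\.\.([0-9a-f]+)(?: [0-7]+)?$"
)


def index_hash_lengths(patch_text):
    """Return, for each index line, the length of every hash on that line."""
    lengths = []
    for line in patch_text.splitlines():
        match = INDEX_LINE.match(line)
        if match:
            hashes = match.group(1).split(",") + [match.group(2)]
            lengths.append([len(h) for h in hashes])
    return lengths


# Sample taken from the --full-index output shown above.
sample = """diff --git a/foo b/foo
index c7bc37b70c7e29e3e4ed048c22ca3929367aa171..ab10096fde76d8c1d6172bd09d0dc4a18fb2c2fa 100644
Binary files a/foo and b/foo differ
"""

print(index_hash_lengths(sample))
```

With --full-index on a SHA-1 repository, every reported length should be 40, so any other value signals that the patch was produced with different options.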
answered Oct 18 '22 05:10 by Edward Thomson


"I tried --abbrev=7, but it seems to have no effect."
"So it is not possible to get a fixed abbrev length?"

It is, but only with Git 2.29 (Q4 2020) and later. Before that release, the output from the "diff" family of commands had abbreviated object names of blobs involved in the patch, but their length was not affected by the --abbrev option.
Now it is.

See commit 3046c7f (21 Aug 2020) by Đoàn Trần Công Danh (sgn).
See commit fc7e73d (21 Aug 2020) by brian m. carlson (bk2204).
(Merged by Junio C Hamano -- gitster -- in commit 096c948, 31 Aug 2020)

diff: index-line: respect --abbrev in object's name

Signed-off-by: Đoàn Trần Công Danh

A handful of Git's commands respect --abbrev for customizing length of abbreviation of object names.

For diff-family, Git supports 2 different options for 2 different purposes:

  • --full-index for showing diff-patch object's name in full, and
  • --abbrev to customize the length of object names in diff-raw and diff-tree header lines, without any options to customise the length of object names in diff-patch format.

When working with diff-patch format, we only have two options, either full index, or default abbrev length.

Although, that behaviour is documented, it doesn't stop users from trying to use --abbrev with the hope of customising diff-patch's objects' name's abbreviation.

Let's allow the blob object names shown on the "index" line to be abbreviated to arbitrary length given via the "--abbrev" option.

To preserve backward compatibility with old script that specify both --full-index and --abbrev, always show full object id if --full-index is specified.

diff-options now includes in its man page:

In diff-patch output format, --full-index takes higher precedence, i.e. if --full-index is specified, full blob names will be shown regardless of --abbrev.

Non default number of digits can be specified with --abbrev=<n>.
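Since --abbrev only affects the "index" line from Git 2.29 onward, a script that must run on mixed Git versions could choose its flags based on the installed version. This is a minimal sketch under that assumption: parse_git_version and diff_command are hypothetical helper names, and falling back to --full-index on older versions is one possible policy, not something Git prescribes.

```python
import re


def parse_git_version(version_output):
    """Extract (major, minor) from text such as 'git version 2.29.0'."""
    match = re.search(r"(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError(f"unrecognised version string: {version_output!r}")
    return int(match.group(1)), int(match.group(2))


def diff_command(version_output, abbrev=None):
    """Build a git diff argument list with a predictable index line.

    On Git 2.29 or later, --abbrev=<n> is honoured on the index line.
    On older versions the only predictable choice is --full-index.
    """
    cmd = ["git", "diff"]
    if abbrev is not None and parse_git_version(version_output) >= (2, 29):
        cmd.append(f"--abbrev={abbrev}")
    else:
        cmd.append("--full-index")
    return cmd


print(diff_command("git version 2.29.0", abbrev=7))
print(diff_command("git version 2.17.1", abbrev=7))
```

In practice version_output would come from running git --version, for example through subprocess.run(["git", "--version"], capture_output=True, text=True).stdout.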


The documentation on the "--abbrev=<n>" option did not say the output may be longer than "<n>" hexdigits, which has been clarified with Git 2.30 (Q1 2021).

See commit cda34e0 (04 Nov 2020) by Junio C Hamano (gitster).
(Merged by Junio C Hamano -- gitster -- in commit ee13beb, 11 Nov 2020)

doc: clarify that --abbrev=<n> is about the minimum length

Helped-by: Derrick Stolee

Early text written in 2006 explains the "--abbrev=<n>" option to "show only a partial prefix", without saying that the length of the partial prefix is not necessarily the number given to the option to ensure that the output names the object uniquely.

Update documentation for the diff family of commands, "blame", "branch --verbose", "ls-files" and "ls-tree" to stress that the short prefix must uniquely refer to an object, and is merely the minimum number of hexdigits used in the prefix.

diff-options now includes in its man page:

lines, show the shortest prefix that is at least '<n>' hexdigits long that uniquely refers the object.

git blame now includes in its man page:

abbreviated object name, use <m>+1 digits, where <m> is at least <n> but ensures the commit object names are unique.

git branch now includes in its man page:

--abbrev=<n>

In the verbose listing that show the commit object name, show the shortest prefix that is at least '<n>' hexdigits long that uniquely refers the object.
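Because --abbrev=<n> is a minimum rather than an exact width, patches that must be compared byte for byte may still differ in their index lines. One workaround is to post-process the patch and cut every hash on an index line to a fixed width. The sketch below assumes that the result is used for comparison only, since a truncated prefix is not guaranteed to identify a blob uniquely; normalize_index_lines is a hypothetical helper, not a Git feature.

```python
import re

# Captures: "index ", the pre-image hash list, the post-image hash,
# and whatever follows (typically " <mode>" or nothing).
INDEX_LINE = re.compile(r"^(index )([0-9a-f,]+)\.\.([0-9a-f]+)(.*)$")


def normalize_index_lines(patch_text, width=7):
    """Truncate every hash on patch index lines to exactly `width` digits.

    Works best on --full-index output, where every hash is at least
    `width` digits long. Lines that are not index headers are untouched.
    """
    def shorten(match):
        old = ",".join(h[:width] for h in match.group(2).split(","))
        new = match.group(3)[:width]
        return f"{match.group(1)}{old}..{new}{match.group(4)}"

    # split/join on "\n" keeps a trailing newline intact.
    return "\n".join(
        INDEX_LINE.sub(shorten, line) for line in patch_text.split("\n")
    )


# The two variants from the question collapse to the same line.
print(normalize_index_lines("index 52a2a35..7813257 100755"))
print(normalize_index_lines("index 52a2a357e..78132574a 100755"))
```

Both calls print the same index line, so two patches generated with different abbreviation lengths compare equal after normalization.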

answered Oct 18 '22 03:10 by VonC