Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

In Git, what is the difference between long and short hashes?

Tags:

git

hash

Here is the long Git hash:

commit c26cf8af130955c5c67cfea96f9532680b963628

Merge: 8654907 37c2a4f

Author: nicolas

Date: Wed Apr 26 13:28:22 2017 -0400

Here is the short one:

Enter image description here

like image 397
Nicolas S.Xu Avatar asked Apr 27 '17 18:04

Nicolas S.Xu


People also ask

How many characters is a short git hash?

Generally, eight to ten characters are more than enough to be unique within a project. One of the largest Git projects, the Linux kernel, is beginning to need 12 characters out of the possible 40 to stay unique. 7 digits are the Git default for a short SHA, so that's fine for most projects.

Does GIT use SHA-1 or sha256?

Background. At its core, the Git version control system is a content addressable filesystem. It uses the SHA-1 hash function to name content.

What determines git hash?

As a summary and reminder, creating a Git commit hash consists of getting: The object ID of the file, which involves hashing the file contents with SHA-1. In Git, hash-object provides this ID. The object entries that go into the tree object.

How long is a git SHA?

Every time a commit is added to a git repository, a hash string which identifies this commit is generated. This hash is computed with the SHA-1 algorithm and is 160 bits (20 bytes) long. Expressed in hexadecimal notation, such hashes are 40 digit strings.


4 Answers

To elaborate a bit more about why the short hash is useful, and why you often don't need the long hash, it has to do with how Git stores things.

c26cf8af130955c5c67cfea96f9532680b963628 will be stored in one of two places. It could be in the file .git/objects/c2/6cf8af130955c5c67cfea96f9532680b963628. Note that the first two characters, c2, make up a directory and the rest is the filename. Since many filesystems don't perform well when there's too many files in one directory, this prevents any one directory from having too many files in it and keeps this little directory database efficient.

With just the short hash, c26cf8a, git can do the equivalent of .git/objects/c2/6cf8a* and that's likely to be a single file. Since the objects are subdivided into subdirectories, there's not too many filenames to look through to check if there's more than one match.

c26cf8a alone contains enough possibilities, 16^7 or 2^28 or 268,435,456 that it's very unlikely another commit will share that prefix.

Basically, Git uses the filesystem itself as a simple key/value store, and it can look up partial keys without having to scan the whole list of keys.


That's one way to store objects. More and more, Git stores its objects in packfiles. It's a very efficient way to store just the changes between files. From time to time, your Git repository will examine what's in .git/objects and store just the differences in .git/objects/pack/pack-<checksum>.

That's a binary format, I'm not going to get into it here, and I don't understand it myself anyway. :)

like image 78
Schwern Avatar answered Oct 16 '22 16:10

Schwern


A short hash is just the first 7 characters of your full hash.

Right below the circled commit in your screenshot, you can see a commit labeled c26cf8a. This should be the commit c26cf8af130955c5c67cfea96f9532680b963628 you were looking for.

like image 21
Stevoisiak Avatar answered Oct 16 '22 16:10

Stevoisiak


The short hash shows the first seven characters of the hash. The short hash of c26cf8af130955c5c67cfea96f9532680b963628 is c26cf8a in the second row. See the documentation:

Git is smart enough to figure out what commit you meant to type if you provide the first few characters, as long as your partial SHA-1 is at least four characters long and unambiguous.

like image 10
jschnasse Avatar answered Oct 16 '22 17:10

jschnasse


The short hash is just a shorter version of the original (long) hash. The original Git hash is 40 bytes long while short is only 8 bytes long. However, it becomes difficult to manage it (in terms of using typing or displaying), and that's why the short version is used.

This approach is used in almost all the projects where a hash is either for integrity (package distribution), versioning (Git and SVN) or layered architecture (Docker).

like image 2
radbrawler Avatar answered Oct 16 '22 15:10

radbrawler