Both Git and GitHub display short versions of SHAs -- just the first 7 characters instead of all 40 -- and both Git and GitHub support taking these short SHAs as arguments.
E.g. git show 962a9e8
E.g. https://github.com/joyent/node/commit/962a9e8
Given that the possibility space is now orders of magnitude lower, "just" 268 million, how do Git and GitHub protect against collisions here? And how do they handle them?
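To put the 268 million figure in perspective, here is a rough birthday-bound estimate in Python. It assumes object IDs behave like uniformly random values, which is an idealization, and the function name is only illustrative.

```python
import math

# Number of distinct 7-character hexadecimal prefixes: 16**7 = 268,435,456.
PREFIX_SPACE = 16 ** 7


def collision_probability(num_objects: int, hex_digits: int = 7) -> float:
    """Approximate chance that at least two objects share a prefix.

    Assumes IDs are uniformly random (an idealization) and uses the
    birthday approximation p ~= 1 - exp(-n(n-1) / (2N)).
    """
    space = 16 ** hex_digits
    exponent = num_objects * (num_objects - 1) / (2.0 * space)
    return -math.expm1(-exponent)


if __name__ == "__main__":
    for n in (1_000, 20_000, 100_000):
        print(n, round(collision_probability(n), 4))
```

Under that assumption, a repository with around 20,000 objects already has roughly even odds that some pair shares a 7-character prefix, so the interesting question is less how collisions are avoided and more how ambiguous prefixes are handled.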
To answer your question: yes, Git will treat them as the same object. Changing the hash algorithm won't help; it would take a "second check" of some sort. Ultimately, though, you would need as much "additional check" data as the length of the data itself to be 100% sure.
If two distinct objects have the same hash, this is known as a collision. Git can only store one half of the colliding pair, and when following a link from one object to the colliding hash name, it can't know which object the name was meant to point to. Two objects colliding accidentally is exceedingly unlikely.
Git relies heavily on SHA-1 for the identification and integrity checking of all file objects and commits. It is therefore possible, in principle, to create two Git repositories with the same head commit hash but different contents, say benign source code in one and backdoored code in the other.
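As background for the collision discussion, the following Python sketch computes a blob's object name in Git's SHA-1 object format (a "blob <size>" header, a NUL byte, then the content) and keeps objects in a dictionary keyed by that name. The TinyStore class is hypothetical and only mimics the behavior described above: a second object whose name is already taken cannot be stored separately.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object name Git assigns to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


class TinyStore:
    """Hypothetical content-addressed store: one slot per object name."""

    def __init__(self):
        self._objects = {}

    def put(self, content: bytes) -> str:
        oid = git_blob_id(content)
        # If the name is already taken, the existing object is kept,
        # so a colliding object would silently lose its own content.
        self._objects.setdefault(oid, content)
        return oid

    def get(self, oid: str) -> bytes:
        return self._objects[oid]


if __name__ == "__main__":
    store = TinyStore()
    print(store.put(b"hello\n"))
```

Identical content always maps to the same name, which is what makes the scheme useful; the risk discussed here is different content mapping to the same name.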
Generally, eight to ten characters are more than enough to be unique within a project. One of the largest Git projects, the Linux kernel, is beginning to need 12 characters out of the possible 40 to stay unique. Seven characters is the Git default for a short SHA, which is fine for most projects.
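How many characters a project needs depends on how close its object IDs are to one another. A minimal Python sketch of that idea follows. The function name, the minimum parameter, and the shortened sample IDs are illustrative, and this is not Git's actual implementation.

```python
def shortest_unique_length(object_ids, minimum=7):
    """Smallest prefix length at which every ID in the set is distinct.

    Assumes all IDs are distinct hex strings of equal length.
    """
    ordered = sorted(set(object_ids))
    needed = minimum
    # After sorting, the longest shared prefix of any pair shows up
    # between two neighbours, so only adjacent IDs need comparing.
    for left, right in zip(ordered, ordered[1:]):
        common = 0
        for a, b in zip(left, right):
            if a != b:
                break
            common += 1
        needed = max(needed, common + 1)
    return needed


if __name__ == "__main__":
    # Hypothetical, shortened IDs used only for illustration.
    sample = ["abcdef1234", "abcdef1299", "1234567890"]
    print(shortest_unique_length(sample, minimum=4))
```

The more objects a repository holds, the more likely two of them share a long prefix, which is consistent with very large projects needing longer abbreviations.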
These short forms exist only to simplify visual recognition and make your life easier. Git doesn't truncate anything; internally, everything is handled with the complete value. You can use a partial SHA-1 at your convenience, though.
Git is smart enough to figure out what commit you meant to type if you provide the first few characters, as long as your partial SHA-1 is at least four characters long and unambiguous — that is, only one object in the current repository begins with that partial SHA-1.
I have a repository that has a commit with an ID of 000182eacf99cde27d5916aa415921924b82972c.
git show 00018
shows the revision, but
git show 0001
prints
error: short SHA1 0001 is ambiguous.
error: short SHA1 0001 is ambiguous.
fatal: ambiguous argument '0001': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions
(If you're curious, it's a clone of the git repository for git itself; that commit is one that Linus Torvalds made in 2005.)
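The lookup rule described above (at least four characters, and exactly one matching object) can be sketched in Python as follows. This is an illustration under stated assumptions, not Git's code: the function and exception names are made up, and the second object ID is a hypothetical value chosen only because it also starts with 0001.

```python
class AmbiguousPrefixError(Exception):
    """Raised when more than one object ID starts with the given prefix."""


def resolve_prefix(prefix, object_ids, min_length=4):
    """Return the single full ID that starts with prefix, or raise."""
    if len(prefix) < min_length:
        raise ValueError(f"prefix must be at least {min_length} characters")
    matches = [oid for oid in object_ids if oid.startswith(prefix)]
    if not matches:
        raise KeyError(f"no object matches {prefix!r}")
    if len(matches) > 1:
        raise AmbiguousPrefixError(f"short SHA1 {prefix} is ambiguous")
    return matches[0]


if __name__ == "__main__":
    ids = [
        "000182eacf99cde27d5916aa415921924b82972c",  # commit from the text
        "0001" + "f" * 36,  # hypothetical second object sharing "0001"
    ]
    print(resolve_prefix("00018", ids))
    try:
        resolve_prefix("0001", ids)
    except AmbiguousPrefixError as exc:
        print("error:", exc)
```

The key design point is that an ambiguous prefix is rejected rather than resolved to an arbitrary match, so a short SHA either names exactly one object or fails loudly.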