Why does Git use SHA-1 as version numbers?

Tags: git

Git uses SHA-1 hashes to let the user refer to a commit.

Subversion (SVN) and Mercurial (hg) use an incremental number.

Why did the Git team make the design decision to use SHA-1 instead of something more descriptive?

asked Jun 27 '12 19:06 by jperelli




1 Answer

Mercurial (hg) also uses SHA-1 hashes to identify commits. On top of that, it derives local revision numbers from the commit history as a convenience. These numbers are only valid within one repository; if you look at another clone, the same commit is not guaranteed to have the same number.

As to why Git and Mercurial use hashes: both have a non-linear commit history. Because both are distributed, people can work on the same code base in their local repositories without having to synchronize with a central authority (as required by SVN and CVS). If people commit locally and merge later, it is hard to come up with a consistent scheme that assigns linearly increasing integer revisions. And even if you could, you would still get different results in different repos.
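To make the numbering problem concrete, here is a minimal toy sketch. It is not how Git or Mercurial are implemented: the `Repo` class, `content_id`, and the commit messages are all made up for illustration, and each repository simply numbers commits in the order it first sees them.

```python
import hashlib


def content_id(message: str) -> str:
    """Stand-in for a content hash (real tools hash much more than the message)."""
    return hashlib.sha1(message.encode("utf-8")).hexdigest()[:12]


class Repo:
    """Toy repository that numbers commits in the order it first sees them."""

    def __init__(self) -> None:
        self.order: list[str] = []

    def add(self, message: str) -> str:
        """Create a commit locally and return its hash-based identifier."""
        cid = content_id(message)
        if cid not in self.order:
            self.order.append(cid)
        return cid

    def pull(self, other: "Repo") -> None:
        """Append every commit from `other` that this repo has not seen yet."""
        for cid in other.order:
            if cid not in self.order:
                self.order.append(cid)

    def local_number(self, cid: str) -> int:
        """Position of the commit in this repo's own arrival order."""
        return self.order.index(cid)


alice, bob = Repo(), Repo()
base = alice.add("initial import")
bob.pull(alice)

# Both sides now work independently before exchanging their commits.
fix = alice.add("fix typo")
feature = bob.add("add feature")
alice.pull(bob)
bob.pull(alice)

# Same set of commits, same hashes, but different local numbers.
print("alice:", alice.local_number(fix), "bob:", bob.local_number(fix))
```

Both repositories end up with the same three commits, yet the "fix typo" commit sits at a different position in each. The hash-based identifier is the only name that stays the same everywhere.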

In the end, it all comes down to the distributed nature. Hashing is a simple way to come up with practically unique identifiers for commits without any coordination. As a side effect, the hash also covers the complete history leading up to a commit, because each commit records the hashes of its parents. This means that even if two commits contain the same diff, they will most likely get different SHA-1 hashes.
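The effect of the history on the hash can be sketched in a few lines of Python. Git names an object by the SHA-1 of a short header (`<type> <size>` followed by a NUL byte) plus the object body, and a commit body lists its tree and its parents. The identity string, timestamp, and commit messages below are hypothetical values chosen only for illustration, and the helper names are not part of any Git API.

```python
import hashlib

# Hypothetical author/committer line; any fixed value works for the demonstration.
IDENTITY = "Example User <user@example.com> 1340823960 +0000"


def git_object_id(kind: str, body: bytes) -> str:
    """SHA-1 over '<kind> <size>' + NUL + body, the way Git names its objects."""
    header = f"{kind} {len(body)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree: str, parents: list[str], message: str) -> bytes:
    """Build the text of a commit object: tree, parents, identities, message."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines.append(f"author {IDENTITY}")
    lines.append(f"committer {IDENTITY}")
    return ("\n".join(lines) + "\n\n" + message + "\n").encode("utf-8")


# The empty tree keeps the example small; both children point at the same tree.
EMPTY_TREE = git_object_id("tree", b"")

root_a = git_object_id("commit", commit_body(EMPTY_TREE, [], "root A"))
root_b = git_object_id("commit", commit_body(EMPTY_TREE, [], "root B"))

# Identical tree, message, and identity; only the parent differs.
child_a = git_object_id("commit", commit_body(EMPTY_TREE, [root_a], "same change"))
child_b = git_object_id("commit", commit_body(EMPTY_TREE, [root_b], "same change"))

print(child_a)
print(child_b)
```

The two children describe the same content and carry the same message, but because each one embeds a different parent hash, their own hashes differ. Changing anything earlier in the history therefore changes every commit hash that follows it.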

answered Oct 04 '22 19:10 by Holger Just