
Does a git commit hash equal a repository state?

Each git commit is attributed a hash which "signs" its content. Does it also sign where the commit came from or is it just the commit data itself which is used for the hash calculation?

Differently phrased: is it impossible (apart from hash collisions) to forge a second repository with its head commit having the exact same hash and same content but the rest of the tree differing?

PhilLab asked May 27 '15 11:05



1 Answer

The answer to the second question is yes: apart from hash collisions, it is impossible.

The first question is not as well formed as you might want, because a commit hash is in fact computed only from the commit data. What makes the answer to the second question "yes" is that "the commit data" includes these key items, which you can see in an actual commit:

$ git cat-file -p HEAD
tree 22abd5c3fed5e2f49fb71e10b39d8c4929e51fc7
parent 4ebdeb68ba87282f87c39d790ba17fe1e021cc97
parent 9eabf5b536662000f79978c4d1b6e4eff5c8d785
[snip]

The tree line gives the hash of the tree (which depends only on the tree contents), and the parent lines (two in this case, as HEAD is a merge commit) give the hashes of the parent commits. Because the hash of the current commit depends on the hashes of its tree and parents, and each parent's hash in turn depends on its own tree and parents, a different repo with a different history or a different tree would produce different hashes at those points, and so the commit would also have a different hash.

(The technical term usually used here is Merkle Tree.)

torek answered Oct 29 '22 17:10