Let's say I committed a binary file, changed it a couple of commits later, and have now changed it back in a new commit.
Out of curiosity, does Git create a new blob for it, or does it detect that the content is already in the history and reuse it? If so, how does it detect that? By checksum?
Git will reuse the same blob.
I ran a test with 3 commits. First I committed a binary file, then I modified the file and committed it again. Finally I overwrote the file with the original binary from the first commit and committed once more.
The binary file's content in the 1st and 3rd commits is therefore the same. Each commit is the HEAD of one of the following branches:
1st commit: "FIRST". 2nd commit: "SECOND". 3rd commit: "master"
Running "git cat-file -p <branch>^{tree}" on each branch then shows the blob hash of the binary file:
$ git cat-file -p FIRST^{tree}
100644 blob ec049240a47b472bd7c31d1fa27118c4fe2f1229 test.db3
$ git cat-file -p SECOND^{tree}
100644 blob a47bb3727e5aefe3ec386bec5520f3e4ffb3a4c5 test.db3
$ git cat-file -p master^{tree}
100644 blob ec049240a47b472bd7c31d1fa27118c4fe2f1229 test.db3
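You can see that the blob hash in the 1st and 3rd commits is the same.
Git's object store is content-addressed: a blob's ID is the SHA-1 checksum of the file's content (prefixed with a short header), so identical content always produces the same ID. When you commit, Git checks whether an object with that ID already exists and reuses it instead of storing a second copy.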
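To make the checksum part concrete, here is a minimal Python sketch of how a blob ID is derived, assuming a repository that uses the default SHA-1 object format (SHA-256 repositories hash differently). The function name git_blob_id and the sample byte strings are made up for illustration; they are not part of Git and are not the real test.db3 content.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git would assign to a blob with this content."""
    # Git hashes the header "blob <size>\0" followed by the raw file bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical stand-ins for the three versions of the binary file.
original = b"\x00\x01binary payload"
modified = original + b"\xff"
restored = bytes(original)  # same bytes as the first version

print(git_blob_id(original) == git_blob_id(restored))  # True
print(git_blob_id(original) == git_blob_id(modified))  # False
```

The file name and mode are not part of the hash; they are stored in the tree object, which is why the tree listings above show the name next to the blob ID. You can compare the result against Git itself with "git hash-object <file>".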