
Git pack filenames -- what is the digest?

Tags: git, pack

Git stores individual (loose) objects in .git/objects/ab/cdefgh... where ab is the first byte of the object's SHA-1 digest, written as two hex digits, and cdefgh... is the rest of the digest.
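As a rough illustration of that loose-object layout, here is a minimal Python sketch. It assumes a SHA-1 repository and the usual "type, space, length, NUL" object header; the function name loose_object_path is made up for this example and is not part of Git.

```python
import hashlib


def loose_object_path(content: bytes, obj_type: str = "blob") -> str:
    """Return the path (relative to .git/) where a loose object would be stored.

    Assumption: the object id is the SHA-1 of "<type> <length>\\0" + content.
    """
    header = f"{obj_type} {len(content)}".encode() + b"\0"
    digest = hashlib.sha1(header + content).hexdigest()
    # The first byte (two hex digits) is the directory, the rest is the filename.
    return f"objects/{digest[:2]}/{digest[2:]}"


if __name__ == "__main__":
    # Same content as: echo 'hello world' | git hash-object --stdin
    print(loose_object_path(b"hello world\n"))
    # prints objects/3b/18e512dba79e4c8300dd08aeb37f8e728b8dad
```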

However, pack files don't follow the same naming policy, and I can find no documentation on how they are named. Any insights?

romul asked Mar 29 '11 08:03



2 Answers

The pack files are kept in objects/pack, which is documented in gitrepository-layout. Within this directory, they are stored as pairs of an index file and the pack file itself, named, for example:

pack-a862cfa8b080773290073999c800a2e655ef9b5d.idx
pack-a862cfa8b080773290073999c800a2e655ef9b5d.pack

How the SHA1sum in those filenames is calculated is explained in the git-pack-objects documentation (my emphasis):

Write into a pair of files (.pack and .idx), using <base-name> to determine the name of the created file. When this option is used, the two files are written in <base-name>-<SHA1>.<pack,idx> files. <SHA1> is a hash of the sorted object names to make the resulting filename based on the pack content, and written to the standard output of the command.

The object names are the SHA1sums of the objects within the pack file.
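A minimal sketch of that naming rule, as this answer describes it: hash the sorted object names and use the result in the filename. Feeding the names to SHA-1 in their raw 20-byte form, rather than as hex text, is an assumption here, and legacy_pack_name is a hypothetical helper, not a Git function.

```python
import hashlib


def legacy_pack_name(object_names):
    """Build "pack-<SHA1>" from the hex object names contained in a pack.

    Assumption: names are sorted and hashed as raw 20-byte values.
    """
    raw_ids = sorted(bytes.fromhex(name) for name in object_names)
    hasher = hashlib.sha1()
    for raw in raw_ids:
        hasher.update(raw)
    return f"pack-{hasher.hexdigest()}"


if __name__ == "__main__":
    names = [
        "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391",
        "3b18e512dba79e4c8300dd08aeb37f8e728b8dad",
    ]
    # The input order does not matter because the names are sorted first.
    print(legacy_pack_name(names) == legacy_pack_name(reversed(names)))
```

Under this rule, two packs holding the same set of objects get the same name even if their bytes differ (for example, because of different delta choices).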

Mark Longair answered Sep 20 '22 17:09


The answer is either "the SHA-1 hash of the entire pack file, minus the last 20 bytes", or "the last 20 bytes of the pack file, written in hexadecimal" (the two are equivalent).

The last 20 bytes of the file are the "trailer checksum", which is itself a SHA-1 hash of the rest of the file (everything except those last 20 bytes).

This was changed in 2013 (previously it was the SHA-1 sum of all the object hashes in the file). Note that the documentation now simply reads "<SHA-1> is a hash based on the pack content". The author explicitly does not guarantee how the SHA-1 is calculated (from the commit log: "Hopefully this will discourage readers from depending on the old or the new calculation.").
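The trailer relationship described in this answer can be checked with a short Python sketch. It assumes a SHA-1 repository with a 20-byte trailer, and it reads the whole pack into memory, so it is only suitable for small packs. pack_name_from_trailer is a hypothetical helper; as noted above, Git does not promise that filenames are derived this way.

```python
import hashlib
from pathlib import Path

TRAILER_LEN = 20  # size of a SHA-1 digest in bytes


def pack_name_from_trailer(pack_path):
    """Return (filename implied by the trailer, whether the trailer checksum is valid)."""
    data = Path(pack_path).read_bytes()
    body, trailer = data[:-TRAILER_LEN], data[-TRAILER_LEN:]
    # The trailer should be the SHA-1 of everything that precedes it.
    checksum_ok = hashlib.sha1(body).digest() == trailer
    return f"pack-{trailer.hex()}.pack", checksum_ok


if __name__ == "__main__":
    import sys

    for path in sys.argv[1:]:
        name, ok = pack_name_from_trailer(path)
        print(f"{path}: trailer implies {name} (checksum ok: {ok})")
```

Run against a file in .git/objects/pack, the implied name can be compared with the actual filename; packs written before the 2013 change may not match.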

mgiuca answered Sep 19 '22 17:09