Why does git store objects in directories with the first two characters of the hash?

Tags: directory-structure

I'm designing a directory structure based on UUIDs so I'm looking at what git does to see if it would be a good model.

I can see that git stores objects in a structure where the first two characters of the hash are used as a directory and the rest of the hash is the file name.
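
For concreteness, the layout described above can be sketched in a few lines of Python. This is only an illustration: the helper name `loose_object_path` and its `prefix_len` parameter are made up here, not anything git itself exposes, and the same split would apply to the hex form of a UUID.

```python
import uuid


def loose_object_path(hex_id: str, prefix_len: int = 2) -> str:
    """Split a hex identifier into '<prefix>/<rest>', git-loose-object style.

    Hypothetical helper for illustration; git does this internally in C.
    """
    if len(hex_id) <= prefix_len:
        raise ValueError("identifier is too short to split")
    return f"{hex_id[:prefix_len]}/{hex_id[prefix_len:]}"


if __name__ == "__main__":
    # A 40-character SHA-1 style name.
    print(loose_object_path("0123456789abcdef0123456789abcdef01234567"))
    # The same idea applied to a random UUID's 32-character hex form.
    print(loose_object_path(uuid.uuid4().hex))
```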

What I'm wondering is why. If directories offer a big advantage, why aren't more levels of subdirectories created, say a directory for each one or two characters of the hash, forming a tree? And if there isn't a big advantage, why have a directory for the first two characters at all?


asked Sep 11 '13 02:09

Monte Goulding

1 Answer

Git switches from "loose objects" (files named like 01/23456789abcdef0123456789abcdef01234567) to "packs" when the number of loose objects exceeds a magic constant (6700 by default, configurable via gc.auto). Since SHA-1 values tend to be well distributed, Git can approximate the total number of loose objects by looking in a single directory. If one of the object directories holds more than (6700 + 255) / 256 = 27 files (integer division, i.e. 6700 / 256 rounded up), it's time for a pack file.
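
The single-directory estimate can be sketched in Python as follows. This is a rough illustration under stated assumptions, not git's actual code: the function names, the choice of `01` as the sampled directory, and the `objects_dir` argument are all hypothetical.

```python
import os


def per_directory_limit(threshold: int = 6700, fanout_dirs: int = 256) -> int:
    """Round threshold / fanout_dirs up, using integer arithmetic only."""
    return (threshold + fanout_dirs - 1) // fanout_dirs


def too_many_loose_objects(objects_dir: str, sample: str = "01",
                           threshold: int = 6700) -> bool:
    """Guess whether the loose-object total exceeds `threshold`.

    Only one fan-out directory is counted; the guess relies on hashes
    being spread evenly across all 256 two-character prefixes.
    """
    limit = per_directory_limit(threshold)
    try:
        count = sum(1 for entry in os.scandir(os.path.join(objects_dir, sample))
                    if entry.is_file())
    except FileNotFoundError:
        # No such directory means no loose objects with that prefix.
        return False
    return count > limit


if __name__ == "__main__":
    print(per_directory_limit(6700))  # 27
```

Counting a single directory keeps the check cheap, at the cost of being an estimate rather than an exact count.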

Thus there's no need for additional fan-out (01/23/4567...): it's unlikely that you will ever get that many objects in one directory. In fact, greater fan-out would make it harder to detect that it is time for an automatic pack, unless you set the threshold higher than 6700: with a second level, the per-directory limit becomes (27 + 255) / 256 = 1, so you'd want to count everything in 01/*/ instead of just 01/.

One could instead use 0/1234567... and allow up to ~419 objects per directory (6700 / 16, rounded up) to get the same behavior, but linear directory scans (on any system that still uses those) are O(n²), and 27² is a mere 729 while 419² is 175561. [Edit: that only applies to file creation, where you have a two-stage search: once to find that it's OK to create, and a second time to find a slot or append. Lookups are still O(n).]
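
The arithmetic for the different fan-out choices can be checked with a short Python sketch. The function name `fanout_table` and the tuple layout are made up for this illustration; the only inputs taken from the answer are the 6700 threshold and the hex prefix lengths.

```python
def fanout_table(threshold: int = 6700, prefix_lengths=(1, 2, 4)):
    """Return (prefix_len, directories, per_dir_limit, limit_squared) rows.

    per_dir_limit is threshold / directories rounded up; limit_squared is
    the n**2 figure used above for linear-scan directories.
    """
    rows = []
    for n in prefix_lengths:
        dirs = 16 ** n  # one directory per possible hex prefix of length n
        limit = (threshold + dirs - 1) // dirs
        rows.append((n, dirs, limit, limit * limit))
    return rows


if __name__ == "__main__":
    for n, dirs, limit, squared in fanout_table():
        print(f"{n} hex char(s): {dirs} dirs, limit {limit}, limit^2 {squared}")
```

With the default threshold, a one-character prefix gives a limit of 419, a two-character prefix gives 27, and a four-character prefix (two levels of 01/23/...) gives 1.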


answered Oct 10 '22 05:10

torek

