One of our git repositories is large enough that a git-clone takes an annoying amount of time (more than a few minutes). The .git directory is ~800M. Cloning always happens on a 100 Mbps LAN over SSH. Even cloning over SSH to localhost takes more than a few minutes.
Yes, we store data and binary blobs in the repository.
Short of moving those out, is there another way of making it faster?
Even if moving large files out were an option, how could we do it without a major interruption that rewrites everyone's history?
One kind of large Git repository is one whose head commit contains a very large number of files. This hurts the performance of virtually every local Git operation. A common mistake that leads to this problem is committing the source code of external libraries into the repository.
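If you want a quick sense of whether that describes your repository, you can count how many files the head commit tracks (standard Git commands, run in any shell):

    # Count the files tracked by the head commit
    git ls-tree -r --name-only HEAD | wc -l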
Git has no limit on repository size, but repository managers typically do cap repository sizes unless you work out special arrangements with them: Bitbucket – 2 GB, GitHub – 2 GB with provisions for 5 GB, GitLab and Azure DevOps – 10 GB.
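To see how close a repository is to those caps, and what is actually taking up the space, something like the following helps. It is only a sketch and assumes a POSIX shell with awk, sort, and head available; the git commands themselves are standard:

    # Report the on-disk size of the object database (packs + loose objects)
    git count-objects -vH

    # List the ten largest blobs anywhere in the repository's history
    git rev-list --objects --all |
      git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
      awk '$1 == "blob"' | sort -k3 -n -r | head -n 10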
I faced the same situation with a ~1 GB repository that needed to be transferred over DSL. I went with the oft-forgotten sneakernet: putting it on a flash drive and driving it across town in my car. That isn't practical in every situation, but you really only have to do it for the initial clone. After that, the transfers are fairly reasonable.
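If you go this route, git bundle lets you move a single file instead of a whole .git directory, and you can repoint the clone at the real server afterwards. A rough sketch; the origin URL below is only a placeholder:

    # On a machine that already has the repository: pack all refs into one file
    git bundle create repo.bundle --all

    # Copy repo.bundle to the flash drive, carry it over, then on the new machine:
    git clone repo.bundle myrepo
    cd myrepo

    # Point origin back at the real server (placeholder URL) so later fetches are incremental
    git remote set-url origin ssh://git@server.example.com/path/to/repo.git
    git fetch origin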