We use git to distribute an operating system and keep it up to date. We can't distribute the full repository since it's too large (>2GB), so we have been using shallow clones (~300M). However, recently, when fetching from a shallow clone, it now inefficiently fetches the entire >2GB repository. This is an untenable waste of bandwidth for deployments.
The git documentation says you cannot fetch from a shallow repository, though that's not strictly true. Are there any workarounds to make a git clone --depth 1 able to fetch just what's changed from it? Or is there some other strategy to keep the distribution size as small as possible whilst having all the bits git needs to do an update?
I tried cloning with --depth 20 to see if it would update more efficiently, but that didn't work. I also looked into http://git-scm.com/docs/git-bundle, but that seems to create huge bundles.
If the source repository is shallow, fetch as much as possible so that the current repository has the same history as the source repository.
--update-shallow: By default when fetching from a shallow repository, git fetch refuses refs that require updating .git/shallow. This option updates ...
You can force recompression by passing the -F option to git-repack(1). Given ample network bandwidth, this will in fact result in faster clones.
In the simplest terms, git pull does a git fetch followed by a git merge. git fetch updates your remote-tracking branches under refs/remotes/<remote>/.
--depth is a git fetch option. I see the doc doesn't really highlight that git clone does a fetch.
When you fetch, the two repos swap info on who has what by starting from the remote's heads and searching backward for the most recent shared commit in the fetched refs' histories, then filling in all the missing objects to complete just the new commits between the most recent shared commits and the newly fetched ones.
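To see the effect of that negotiation, here is a rough sketch using two throwaway repositories. The paths, file names and commit messages are made up for the demo, and it only shows the outcome (which commits arrive), not the wire protocol itself.

```shell
#!/bin/sh
set -eu

# Hypothetical throwaway repositories under a temp directory.
work=$(mktemp -d)
src="$work/source"
client="$work/client"

# Source repository with three commits.
git init -q "$src"
git -C "$src" config user.email "demo@example.com"
git -C "$src" config user.name "Demo"
for i in 1 2 3; do
    echo "change $i" > "$src/file.txt"
    git -C "$src" add file.txt
    git -C "$src" commit -q -m "change $i"
done
branch=$(git -C "$src" symbolic-ref --short HEAD)

# Full clone: the client starts with all three commits.
git clone -q "$src" "$client"
shared=$(git -C "$client" rev-parse HEAD)

# Two more commits appear upstream.
for i in 4 5; do
    echo "change $i" > "$src/file.txt"
    git -C "$src" commit -q -am "change $i"
done

git -C "$client" fetch -q origin

# The most recent shared commit is the client's old tip ...
base=$(git -C "$client" merge-base HEAD "origin/$branch")
# ... and only the commits after it are new to the client.
new_commits=$(git -C "$client" rev-list --count "HEAD..origin/$branch")
echo "new commits fetched: $new_commits"
```

If the two histories shared nothing, there would be no merge base at all, which is the situation where fetch ends up pulling everything.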
A --depth=1 fetch just gets the branch tips and no prior history. Further fetches of those histories will fetch everything new by the above procedure, but if the previously-fetched commits aren't in the newly fetched history, fetch will retrieve all of it, unless you limit the fetch with --depth.
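A small experiment along these lines, assuming throwaway local repositories (all paths and names below are invented for the demo). It covers only the good case, where the new upstream commits build on the tip the shallow clone already has, so a plain fetch brings in just the new commits.

```shell
#!/bin/sh
set -eu

# Hypothetical throwaway repositories under a temp directory.
work=$(mktemp -d)
src="$work/source"
client="$work/client"

git init -q "$src"
git -C "$src" config user.email "demo@example.com"
git -C "$src" config user.name "Demo"
for i in 1 2 3; do
    echo "change $i" > "$src/file.txt"
    git -C "$src" add file.txt
    git -C "$src" commit -q -m "change $i"
done
branch=$(git -C "$src" symbolic-ref --short HEAD)

# Shallow clone. A file:// URL is used because --depth is ignored
# when cloning from a plain local path.
git clone -q --depth=1 "file://$src" "$client"
tip_only=$(git -C "$client" rev-list --count HEAD)

# .git/shallow lists the commits whose parents were cut off.
boundary=$(cat "$client/.git/shallow")

# Two new commits land on top of the tip the client already has.
for i in 4 5; do
    echo "change $i" > "$src/file.txt"
    git -C "$src" commit -q -am "change $i"
done

# Plain fetch, no --depth: only the new commits are added.
git -C "$client" fetch -q origin
after_fetch=$(git -C "$client" rev-list --count "origin/$branch")
echo "commits after clone: $tip_only, after fetch: $after_fetch"
```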
Your client did a depth=1 fetch from one repo and then switched URLs to a different repo. At least one long ancestry path in this new repo's refs apparently shares no commits with anything currently in your repo. That might be worth investigating, but either way, unless there's some particular reason not to, your clients can just do every fetch with --depth=1.
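As a sketch of what such a client update could look like, assuming a throwaway source repository standing in for the real remote (the paths, the VERSION file and the reset step are my own choices for the demo, not something from the question):

```shell
#!/bin/sh
set -eu

# Hypothetical throwaway paths; a real deployment would use its own remote URL.
work=$(mktemp -d)
src="$work/source"
client="$work/client"

# Source repository with a few releases.
git init -q "$src"
git -C "$src" config user.email "demo@example.com"
git -C "$src" config user.name "Demo"
for i in 1 2 3; do
    echo "release $i" > "$src/VERSION"
    git -C "$src" add VERSION
    git -C "$src" commit -q -m "release $i"
done
branch=$(git -C "$src" symbolic-ref --short HEAD)

# Initial shallow distribution (file:// so that --depth is honoured locally).
git clone -q --depth=1 "file://$src" "$client"

# A new release is published upstream.
echo "release 4" > "$src/VERSION"
git -C "$src" commit -q -am "release 4"

# Client update: fetch only the new tip, then move the working tree to it.
git -C "$client" fetch -q --depth=1 origin "$branch"
git -C "$client" reset -q --hard FETCH_HEAD

client_commits=$(git -C "$client" rev-list --count HEAD)
client_version=$(cat "$client/VERSION")
echo "commits in client: $client_commits, version: $client_version"
```

The reset --hard suits a deployment that never commits locally; a client with local changes would need a merge or rebase instead, which a depth-1 history may not support well.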