My company is considering implementing Git but I have a question about what the best way would be to set it up. We have 3 sites and are planning on using Gerrit2 to create mirrors. Our repository is about 2GB and we would like to start adding binaries to it. I'm concerned about the space usage though. I don't mind if all versions of the binaries are stored in a handful of locations but I want to make sure that they don't bog down clone operations.
I understand that Git uses hard links but I think that will only work if we place a copy of the repository on every mount. Are there better options and if so what are the tradeoffs? Options that I'm looking at are "--shared" and "--reference".
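For reference, here is a minimal sketch of what the two options from the question do, run against a throwaway repository. The directory names (`central`, `shared-clone`, `reference-clone`) and the demo identity are made up for illustration; nothing here reflects the actual 2GB repository or the Gerrit setup.

```shell
#!/bin/sh
set -eu

# Work in a scratch directory so nothing real is touched.
work=$(mktemp -d)
cd "$work"

# Hypothetical identity so the commit works on a machine without git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A tiny stand-in for the central repository.
git init -q central
cd central
echo "hello" > file.txt
git add file.txt
git commit -q -m "initial commit"
cd ..

# --shared: a local clone that copies no objects; it records the source's
# object directory in .git/objects/info/alternates and reads from there.
git clone -q --shared central shared-clone

# --reference: clone from a URL (file:// here, normally a remote server)
# while borrowing any objects already present in a local repository.
git clone -q --reference "$work/central" "file://$work/central" reference-clone

# Both clones point back at the objects of the repository they borrow from.
cat shared-clone/.git/objects/info/alternates
cat reference-clone/.git/objects/info/alternates
```

The tradeoff is that both clones depend on the repository they borrow from: if objects are pruned or the repository is moved, the borrowing clone can break. Neither option relies on hard links, so they are not limited to a single mount as long as the path in `alternates` stays reachable.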
Another alternative to git-media (mentioned by Marcelo) is git-annex.
See what git-annex is not:
git-annex is not git-media, although they both approach the same problem from a similar direction. I only learned of git-media after writing git-annex, but I probably would have still written git-annex instead of using it.
Currently, git-media has the advantage of using git smudge filters rather than git-annex's pile of symlinks, and it may be a tighter fit for certain situations.
It lacks git-annex's support for widely distributed storage, using only a single backend data store.
It also does not support partial checkouts of file contents, like git-annex does.
Note: abdelsaid adds in the comments:
You can use git-annex with bup (bup lets you keep versions of the files); see git-annex / special remotes / bup (and "Using bup").
I have presented bup in more detail in "git with large files".
To just use native git, use a separate repo to house the binaries via git submodules. This has worked for me on IVR systems which had a ton of gigantic .wav files. If you need further clarification, feel free to contact me.
Here's a good write-up on submodules:
http://progit.org/book/ch6-6.html
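As a rough sketch of the submodule approach, assuming a hypothetical `binaries` repository holding the large files and an `app` repository for the code (both names, the `media` path, and the demo identity are invented for the example):

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# Hypothetical identity so commits work without any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Repository that holds only the binaries.
git init -q binaries
cd binaries
printf 'placeholder audio data\n' > prompt.wav
git add prompt.wav
git commit -q -m "Add prompt.wav"
cd ..

# Main repository: references the binaries repo as a submodule at media/.
# protocol.file.allow=always is only needed because this demo uses local
# paths; recent Git versions block file-based submodule clones by default.
git init -q app
cd app
git -c protocol.file.allow=always submodule --quiet add "$work/binaries" media
git commit -q -m "Track binaries as a submodule"
cd ..

# A plain clone fetches the code but leaves media/ empty, so the binaries
# do not slow it down.
git clone -q "$work/app" app-clone
cd app-clone
ls -A media

# Anyone who needs the binaries opts in explicitly.
git -c protocol.file.allow=always submodule --quiet update --init
ls media
```

People who always want the binaries can clone with `git clone --recurse-submodules` instead of running the update step separately.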
Hope this helps.