I'm migrating some personal project repositories from Mercurial to Git. One of the projects relies on some large but unchanging shapefiles and SQLite databases. These files are important and need to live inside the repo so that anyone checking out the project has access to them. With Mercurial, this was easy to deal with; I used the largefiles extension. largefiles automatically handled file additions/changes by not trying to analyze the content of files larger than X in size. That is, I could do hg addremove, and everything would just work.
Git, just like Mercurial, is not designed to track large files. However, I don't see a similar extension. I've looked into git-annex, but it seems like I need to manually keep track of the files (i.e., I can't just arbitrarily do git add -A). Also, if I'm reading this right, git-annex seems to maintain large files in a completely separate repo. I want to keep the large files in my current repo, in the directories where they currently live.
How do people handle this situation? Surely there are lots of projects that need to track large files integral to the operation of the project. Will git-annex accomplish this, or do I need some other extension?
The only Git-like system designed to deal with large (even very, very large) files is:
bup (see more in GitMinutes #24)
The result is an actual Git repo that regular Git commands can read.
I detail how bup differs from Git in "git with large files".
Surely there are lots of projects that need to track large files integral to the operation of the project.
No, there aren't. This is simply not what Git is designed for, and even git-annex is a workaround that isn't entirely satisfactory: see "git-annex with large files".
I mention other tools in "How to handle a large git repository?".
largefiles automatically handled file additions/changes by not trying to analyze the content of files larger than X in size.
How does this differ from core.bigFileThreshold? From the git-config documentation:
core.bigFileThreshold
Files larger than this size are stored deflated, without attempting delta compression. Storing large files without delta compression avoids excessive memory usage, at the slight expense of increased disk usage.
Default is 512 MiB on all platforms. This should be reasonable for most projects as source code and other text files can still be delta compressed, but larger binary media files won’t be.
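For anyone who wants to try this, here is a minimal sketch of lowering the threshold in a single repository. The 50 MiB value and the throwaway repo are assumptions made for illustration, not recommendations. Note that, per the documentation quoted above, this only skips delta compression for files above the threshold; the files are still stored in the repo as usual.

```shell
#!/bin/sh
set -eu

# Work in a throwaway repository so no real project is touched.
repo=$(mktemp -d)
git init --quiet "$repo"
cd "$repo"

# Lower the threshold from the 512 MiB default to 50 MiB.
# (50m is an arbitrary example value; pick one that suits your files.)
git config core.bigFileThreshold 50m

# Read the setting back as a byte count; --int expands the "m" suffix.
threshold_bytes=$(git config --int --get core.bigFileThreshold)
echo "core.bigFileThreshold = $threshold_bytes bytes"
```

With the setting in place, a plain git add -A should keep working as before, since the threshold only changes how large blobs are stored, not how they are tracked.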