Here is the situation: an ad hoc analytics repository with one directory per analysis. Each directory contains one or more scripts tied to one or more data files, which come in different formats and sizes (sometimes considerable). Scripts without data are generally useless, so we would like to store the data files. On the other hand, it is sometimes useful to look at a script (to see how an analysis was conducted) without being forced to download the associated data files.
We definitely don't want to store the data in a separate repository (runtime issues, associating scripts with data files, etc.).
What was analyzed:
The idea that comes to mind is that it would be convenient to exclude certain locations or certain files (e.g. those larger than 50 MB) from being pulled or cloned from the repository, simply to avoid transferring unwanted data. Is that possible?
If some files are not touched in subsequent commits, they are not needed for future pushes. I am probably (or even certainly) missing some knowledge about the underlying mechanisms of Git, so I would be thankful for clarification.
git clone --no-checkout --filter=blob:limit=100m
This allows fetching only files smaller than a given size, provided the server supports partial clone filters; a server that does not will ignore the filter and send everything.
Then you have to check out all files except the big ones. A simple strategy that will likely work is git rev-list --filter=blob:limit=100 | xargs, but I have not tested it.
See this answer for more details: How do I clone a subdirectory only of a Git repository?
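To see which blobs the filter actually left out, a minimal sketch along these lines may help. It assumes a Git recent enough to support --filter on clone and --missing=print on rev-list, and a server that accepts filters. The function name list_filtered_blobs and its arguments are hypothetical, not part of Git.

```shell
# Hypothetical helper: partially clone URL into DEST without checking anything
# out, then print the object IDs of blobs that were NOT transferred.
# LIMIT uses the blob:limit syntax, e.g. 100m or 1k.
list_filtered_blobs() {
    lfb_url=$1
    lfb_dest=$2
    lfb_limit=$3

    # --no-checkout avoids the checkout step, which would otherwise
    # fetch the omitted blobs on demand.
    git clone --quiet --no-checkout --filter="blob:limit=$lfb_limit" \
        "$lfb_url" "$lfb_dest" || return 1

    # rev-list marks objects that are absent locally with a leading "?".
    git -C "$lfb_dest" rev-list --objects --missing=print HEAD |
        sed -n 's/^?//p'
}
```

For example, list_filtered_blobs SERVER-REPOSITORY work 100m would print one object ID per omitted blob. If the output is empty even though the repository holds large files, the server probably ignored the filter.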
git LFS
This is a solution that can already be used on GitHub and GitLab.
You just track your large blobs in LFS, and then clone without LFS (see: How to clone/pull a git repository, ignoring LFS?):
GIT_LFS_SKIP_SMUDGE=1 git clone SERVER-REPOSITORY
and finally manually pull any missing LFS files that you may want: https://github.com/git-lfs/git-lfs/issues/1351
git lfs pull --include "*.dat"
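After a clone with GIT_LFS_SKIP_SMUDGE=1, the large files are present only as small text pointer files. A rough sketch for finding which files are still pointers (and so still need git lfs pull) could look like this. The function names is_lfs_pointer and list_lfs_pointers are hypothetical, and the check only looks at the version line that starts an LFS pointer file.

```shell
# Hypothetical helper: succeed if FILE looks like a Git LFS pointer file
# rather than real content.
is_lfs_pointer() {
    [ -f "$1" ] || return 1
    # Read only the first bytes so large binary files are not read in full;
    # strip NUL bytes so the shell does not complain about binary input.
    case "$(head -c 100 "$1" | tr -d '\000')" in
        "version https://git-lfs.github.com/spec/v1"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: print every file under DIR (default: current
# directory) that is still an LFS pointer, skipping the .git directory.
list_lfs_pointers() {
    find "${1:-.}" -name .git -prune -o -type f -print |
        while IFS= read -r llp_file; do
            if is_lfs_pointer "$llp_file"; then
                printf '%s\n' "$llp_file"
            fi
        done
}
```

Running list_lfs_pointers in the working tree then shows which paths to pass to git lfs pull --include.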
Git sparse checkout lets you choose which subdirectories to check out. As far as I know, it cannot select files based on anything else (e.g. size).
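As a sketch of the sparse checkout route, assuming Git 2.25 or newer (which provides git clone --sparse and the git sparse-checkout command), something like the following checks out only the named directories. The function name clone_only_dirs and the directory names in the example are hypothetical.

```shell
# Hypothetical helper: clone URL into DEST and populate the working tree
# with only the directories given as the remaining arguments.
clone_only_dirs() {
    cod_url=$1
    cod_dest=$2
    shift 2

    # --sparse starts with only top-level files checked out.
    git clone --quiet --sparse "$cod_url" "$cod_dest" || return 1

    # Restrict the working tree to the requested directories.
    git -C "$cod_dest" sparse-checkout set "$@"
}
```

For example, clone_only_dirs SERVER-REPOSITORY work analysis1 leaves the other analysis directories out of the working tree. Note that sparse checkout on its own only limits what is checked out; the objects are still transferred unless it is combined with a --filter option on the clone.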