I thought that git lfs migrate rewrote the history of a repo so that specified large files were kept in LFS. This means that the repo should get smaller, because it doesn't directly contain all versions of large files. However, when I run
git lfs migrate import --include="test-data/**" --include-ref=refs/heads/master
all of the files in the test-data/ directory are replaced with files that look like this:
version https://git-lfs.github.com/spec/v1
oid sha256:5853b5a2a95eaca53865df996aee1d911866f754e6089c2fe68875459f44dc55
size 19993296
And the .git folder becomes twice as large (400MB to 800MB). I am confused. What is git lfs migrate doing?
Edit: I did clean up after the migration by running
git reflog expire --expire-unreachable=now --all
git gc --prune=now
before running du. Afterwards, most of the space is used by these folders:
414M .git/objects
398M .git/lfs
IMPORT. The import mode migrates objects present in the Git history to pointer files tracked and stored with Git LFS. It supports all the core migrate options as well as some additional ones of its own.
Git LFS can be used when you want to version large files, usually valuable output data, that are larger than GitHub's file size limit (100 MB). These files can be plain text or binaries.
Git LFS stores the binary file content on a custom server or via GitHub, GitLab, or BitBucket's built-in LFS storage. To find the binary content's local location, look in your repository's .git/lfs/objects folder.
The only problem is that the original Git objects of the binary files are still in the .git folder because you haven't garbage-collected them.
You should follow the git lfs migration tutorial, which explains:
The above successfully converts pre-existing git objects to lfs objects. However, the regular objects still persist in the .git directory. These will be cleaned up eventually by git, but to clean them up right away, run:
git reflog expire --expire-unreachable=now --all
git gc --prune=now
After running that, your .git folder should be the same size as before, but if you look inside it you should see that objects is now much smaller than before the migration and that lfs holds the rest.
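To check how the space is split between the two directories, something like the following can be run from the repository root. This is only a sketch: report_git_sizes is a hypothetical helper name, and it relies on nothing but du.

```shell
# Print the size of the regular Git object store and of the local LFS store.
# Takes the repository root as its only argument (defaults to the current directory).
# report_git_sizes is a hypothetical helper name, not part of git or git-lfs.
report_git_sizes() {
    repo=${1:-.}
    for dir in "$repo/.git/objects" "$repo/.git/lfs"; do
        if [ -d "$dir" ]; then
            du -sh "$dir"
        else
            echo "missing: $dir"
        fi
    done
}

report_git_sizes .
```

Running it once before and once after the reflog expire and gc commands shows whether .git/objects actually shrank.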
The even better news is that when other developers or applications clone the repo, they only have to download the objects directory and then fetch only the "large files" they check out, not the whole history.
I thought that git lfs migrate rewrote the history of a repo so that specified large files were kept in LFS.
Perfectly true.
This means that the repo should get smaller, because it doesn't directly contain all versions of large files.
Not exactly true. The promise of git lfs is not that your repo will be smaller, but that when you clone, you won't have to download all the Git objects, so the clone will be smaller and faster. For files managed by git-lfs, only the versions that should appear in your working directory are downloaded during git checkout.
All of the files in the test-data/ directory are replaced with files that look like this:
That's how git-lfs works. Instead of committing the file to the repository, it commits this "pointer" file, which contains the id of the object. The content of the file is stored in the .git/lfs/objects folder, and these objects are uploaded to the server when you git push.
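To make the link between a pointer file and the stored content concrete, here is a small sketch that reads the oid and size out of the pointer shown in the question and builds the path where the content would be expected locally. The directory layout (two levels named after the first two byte pairs of the oid) is my assumption about how git-lfs arranges .git/lfs/objects, so check it against your own repository.

```shell
# Sample pointer file, copied from the question.
pointer=$(mktemp)
cat > "$pointer" <<'EOF'
version https://git-lfs.github.com/spec/v1
oid sha256:5853b5a2a95eaca53865df996aee1d911866f754e6089c2fe68875459f44dc55
size 19993296
EOF

# Extract the object id (without the "sha256:" prefix) and the size in bytes.
oid=$(awk '$1 == "oid" { sub(/^sha256:/, "", $2); print $2 }' "$pointer")
size=$(awk '$1 == "size" { print $2 }' "$pointer")

# Assumed local layout: .git/lfs/objects/<oid chars 1-2>/<oid chars 3-4>/<oid>
lfs_path=".git/lfs/objects/$(printf '%s' "$oid" | cut -c1-2)/$(printf '%s' "$oid" | cut -c3-4)/$oid"

echo "oid:  $oid"
echo "size: $size"
echo "path: $lfs_path"
rm -f "$pointer"
```

If the assumption about the layout holds, ls -l on the printed path inside the migrated repository should show a file of the size given in the pointer.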
And the .git folder becomes twice as large (400MB to 800MB). I am confused.
Because all the files managed by git lfs are stored in this folder, it can become huge.
I also think the size of your repository doubled because the objects are stored twice for the moment: in .git/objects, until you ditch the old history (by purging the reflog and running git gc, but only do that once you are sure your lfs migration is a success), and in .git/lfs/objects, because you made the git lfs conversion.
I think (but I'm not sure) that .git/lfs/objects serves as a cache folder, so once you have pushed all the new history, and with it uploaded the files managed by lfs, you could delete it to reduce the size of your repository.
But if I were you, I would not do that!
To see the real effect of git lfs on your local repository, once you have --force pushed the new history (and the old one is no longer in the remote repository), I would do a fresh clone. Your local repository should then be smaller.
But the folder .git/lfs/objects will still grow in the future every time a new version of these files is downloaded (though it should always stay smaller than if you didn't use git lfs).
I hope you now understand better how it works.
PS:
All of the files in the test-data/ directory are replaced with files that look like this:
I hope that what you said is partly false: that your files in test-data/ still contain the real content, and that what you report is only what a git command shows you.
Could you confirm? Otherwise you have a problem, which could be explained by not having git lfs installed.
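One way to confirm is to test whether the files in the working directory really start with the pointer header. A minimal sketch, where is_lfs_pointer is a hypothetical helper name and test-data is the directory from the question:

```shell
# Return success if the given file begins with the Git LFS pointer header line.
# is_lfs_pointer is a hypothetical helper name, not a git-lfs command.
is_lfs_pointer() {
    [ -f "$1" ] &&
        head -c 200 "$1" | grep -q '^version https://git-lfs\.github\.com/spec/v1$'
}

# List working-tree files under test-data/ that are pointers rather than real content.
if [ -d test-data ]; then
    find test-data -type f | while IFS= read -r f; do
        if is_lfs_pointer "$f"; then
            echo "pointer: $f"
        fi
    done
fi
```

If this prints nothing, the working directory holds the real content and the pointer text only appears in what Git itself stores.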