 

What is the advantage of git lfs?

Tags:

git

git-lfs

GitHub has a limit on pushing large files, so if you want to push a large file to your repo, you have to use Git LFS.

I know it's a bad idea to add binary files to a git repo. But if I am using GitLab on my own server, where there is no limit on file size in a repo, and I don't mind the repo becoming very large on my server, what is the advantage of git lfs in this situation? Will git clone or git checkout be faster?

Asked Feb 23 '16 by Sanster

People also ask

Does Git LFS track changes?

Much like a .gitignore file, the .gitattributes file is updated automatically as Git LFS tracks new files. To make sure the changes are tracked, each time the .gitattributes file is updated it needs to be staged and committed, otherwise issues may occur later on.
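The tracking workflow described above can be sketched as follows (the `*.psd` pattern is just an illustrative example):

```shell
# One-time setup: install the LFS filter hooks into your git config
git lfs install

# Start tracking a file pattern; this appends a rule to .gitattributes
git lfs track "*.psd"

# The .gitattributes file itself must be staged and committed, or
# collaborators will commit the real binaries instead of LFS pointers
git add .gitattributes
git commit -m "Track *.psd files with Git LFS"
```

After this, any `*.psd` file added to the repo is stored in LFS rather than in regular git objects.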

Is Git LFS secure?

A critical vulnerability (CVE-2020-27955) in Git Large File Storage (Git LFS), an open source Git extension for versioning large files, allows attackers to achieve remote code execution if a Windows-using victim is tricked into cloning the attacker's malicious repository with a vulnerable version of the tool.

Does Git LFS reduce size?

Git LFS does not compress files. Some files are compressible, and some are not. It, like Git's partial clone feature, is designed to offload most of the data to a trusted server for the purposes of making local access lighter and cheaper.

Does Git LFS cost money?

One data pack costs $5 per month, and provides a monthly quota of 50 GB for bandwidth and 50 GB for storage. You can purchase as many data packs as you need.


1 Answer

One specificity of Git (and other distributed systems) compared to centralized systems is that each repository contains the whole history of the project. Suppose you create a 100 MB file and modify it 100 times in a way that doesn't compress well. You'll end up with a roughly 10 GB repository. This means that each clone downloads 10 GB of data and eats 10 GB of disk space on every machine where you make a clone. What's even more frustrating: you'd still have to download those 10 GB of data even after you git rm the big file.

Putting big files in a separate system like git-lfs allows you to store only pointers to each version of the file in the repository, so each clone downloads only a tiny piece of data for each revision. A checkout downloads only the version you are actually using, i.e. 100 MB in the example above. As a result, you use disk space on the server, but save a lot of bandwidth and disk space on the client.
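To make the pointer idea concrete: what git actually commits for an LFS-tracked file is a small text stub in the Git LFS v1 pointer format, regardless of the real file's size. The file path below is hypothetical; inspecting the committed blob shows something like:

```shell
# Show what git stored for an LFS-tracked file (path is hypothetical)
git show HEAD:assets/video.mp4
# version https://git-lfs.github.com/spec/v1
# oid sha256:<hash of the real content>
# size 104857600
```

The real content, addressed by that sha256 oid, lives on the LFS server and is fetched only when a checkout actually needs it.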

In addition to this, the algorithm used by git gc (internally, git repack) does not always work well with big files. Recent versions of Git have made progress in this area and it should work reasonably well, but a big repository with big files in it may eventually get you into trouble (such as not having enough RAM to repack your repository).
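If you do end up with big files in plain git and repacking exhausts memory, git's pack.* configuration can bound resource usage. The specific values below are assumptions to tune for your machine, not recommendations:

```shell
# Cap the memory the delta window may use per thread during repack
# (values here are illustrative; tune them for your machine)
git config pack.windowMemory "256m"

# Limit the size of each generated packfile
git config pack.packSizeLimit "1g"

# Use a single thread to bound peak memory during repack
git config pack.threads "1"

git gc
```

This trades repack speed and compression quality for a lower memory ceiling.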

Answered Sep 21 '22 by Matthieu Moy