Is the git storage model wasteful?

I was reading about how git stores changes in The Git Object Model [1].

It sounds like if I change one line in a file, git is going to re-store the entire file. Does this waste a lot of space compared to, say, Subversion, which only stores diffs?

(Or am I misunderstanding the storage model?)

[1] As of 2011, when the question was asked. The closest current link is Git Internals - Git Objects.

Greg asked Sep 06 '11 14:09



1 Answer

Git will eventually pack everything into delta-compressed packfiles as part of its routine internal maintenance (git gc), at which point this is no longer an issue.

This isn't really an issue today, though. Git's philosophy is that disk space is cheap, and that it's better to optimize for speed than for storage efficiency. Chances are you'll be better served by an SCM that is twice as fast than by one that requires half the disk space.

See the Git Book's chapter on The Packfile as well as git repack and git-pack-objects.
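
To make the trade-off concrete, here is a minimal Python sketch (not git's actual code) of what happens before and after packing. It computes blob ids the way git does (SHA-1 over a "blob <size>\0" header plus the content) and compares the cost of two whole zlib-compressed objects against one object plus a delta. The file contents are made up, and the delta is a plain unified diff from difflib standing in for git's binary delta format, so the byte counts are only indicative.

```python
import difflib
import hashlib
import zlib


def git_blob_id(data: bytes) -> str:
    """Return the SHA-1 object id git assigns to a blob with this content."""
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


def loose_object_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed object (header + content)."""
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return len(zlib.compress(header + data))


# Hypothetical file: 2000 distinct lines, then a copy with one line edited.
lines_v1 = [
    f"line {i}: {hashlib.sha1(str(i).encode()).hexdigest()}\n" for i in range(2000)
]
lines_v2 = list(lines_v1)
lines_v2[1000] = "line 1000: edited\n"

v1 = "".join(lines_v1).encode()
v2 = "".join(lines_v2).encode()

# Illustrative stand-in only: a unified diff, NOT git's binary delta encoding.
delta = "".join(difflib.unified_diff(lines_v1, lines_v2, n=0)).encode()

whole_snapshots = loose_object_size(v1) + loose_object_size(v2)
base_plus_delta = loose_object_size(v1) + len(zlib.compress(delta))

# A one-line change yields a completely new blob id, so a second full object.
print("blob ids differ:", git_blob_id(v1) != git_blob_id(v2))
print("two whole objects: ", whole_snapshots, "bytes")
print("one object + delta:", base_plus_delta, "bytes")
```

The first figure corresponds roughly to the loose objects you get right after committing, the second to what delta packing can recover later. In a real repository, git count-objects -v reports how much data is still loose and how much is already packed.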

meagar answered Oct 15 '22 15:10