
Why is my Git repository so much larger than the Mercurial version?

Tags: git, mercurial

I've converted a Mercurial repository to Git, using fast-export. But the Git repository is huge: 18 GB for Git vs. 3.4 GB for Mercurial. None of my cleanup steps have helped.

My Mercurial repository is dominated by one 65 MB file (Anki flashcards in SQLite format) that gets updated daily. Its history has grown to be 2.9 GB, under .hg/store/data.

I was hoping Git might be able to compress the history a little better, but I have been unable to shrink the repository below 18 GB!

I have tried git prune, git gc, and others, to no avail. I even tried zipping the .git folder, and it still came out to be exactly 18 GB.

Am I missing something?

Update: I tried Bazaar (bzr), and it compressed my repository to only 2.3 GB. Nice!

asked Aug 06 '11 22:08 by slattery


1 Answer

One reason could be that Mercurial has a very compact storage format based on diffs, even for binary files. Because re-creating a version from a long chain of diffs can be very time-consuming, it stores a full snapshot as soon as the diffs plus the old original exceed twice the size of a full snapshot.
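To make that rule concrete, here is a rough sketch in Python. It is only a simplified model of the rule as I described it, not Mercurial's actual storage code; the function name `plan_storage` and its inputs are made up for illustration.

```python
def plan_storage(revision_sizes, delta_sizes):
    """Decide, per revision, whether to store a delta or a full snapshot.

    Simplified model (an assumption, not Mercurial's real implementation):
    revision_sizes[i] is the full size of revision i, and delta_sizes[i]
    is the size of the diff from revision i-1 to i (index 0 is ignored).
    """
    plan = []
    chain_bytes = 0  # bytes needed to rebuild the current revision
    for i, full_size in enumerate(revision_sizes):
        if i == 0:
            # The first revision has nothing to diff against.
            plan.append("snapshot")
            chain_bytes = full_size
            continue
        candidate = chain_bytes + delta_sizes[i]
        if candidate > 2 * full_size:
            # The chain has grown past twice a full copy: start over.
            plan.append("snapshot")
            chain_bytes = full_size
        else:
            plan.append("delta")
            chain_bytes = candidate
    return plan
```

For example, with five revisions of 100 units each and diffs of 40 units, `plan_storage([100] * 5, [0, 40, 40, 40, 40])` stores the fourth revision as a fresh snapshot, because the chain would otherwise reach 220 units, which is more than twice 100.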

Personally, I would try storing a text dump of your SQLite database instead of the database file itself and see where that gets you. It might be far more efficient.
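A minimal sketch of producing such a dump with Python's standard `sqlite3` module is below. The function name `dump_sqlite_to_sql` and the file names in the usage comment are hypothetical, and I have not tried this on an Anki collection, so treat it as a starting point rather than a tested recipe.

```python
import sqlite3


def dump_sqlite_to_sql(db_path, dump_path):
    """Write the SQLite database at db_path to dump_path as SQL text."""
    conn = sqlite3.connect(db_path)
    try:
        with open(dump_path, "w", encoding="utf-8") as out:
            # iterdump() yields the SQL statements that recreate the database.
            for statement in conn.iterdump():
                out.write(statement + "\n")
    finally:
        conn.close()


# Hypothetical usage; substitute your own file names:
# dump_sqlite_to_sql("collection.anki", "collection.sql")
```

You would then commit the `.sql` file instead of the binary database. A daily change that touches a few rows should show up as a few changed lines of text, which is the kind of input diff-based storage tends to handle well.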

I do not know what Git's storage format is, but I am guessing it does not involve diffs in the same way as Mercurial's does.

answered Sep 20 '22 08:09 by Omnifarious