
How to remove unused objects from a git repository?

I accidentally added, committed and pushed a huge binary file with my very latest commit to a Git repository.

How can I make Git remove the object(s) that was/were created for that commit so my .git directory shrinks to a sane size again?

Edit: Thanks for your answers; I tried several solutions. None worked. For example, the one from GitHub removed the files from the history, but the size of the .git directory hasn't decreased:

    $ BADFILES=$(find test_data -type f -exec echo -n "'{}' " \;)

    $ git filter-branch --index-filter "git rm -rf --cached --ignore-unmatch $BADFILES" HEAD
    Rewrite 14ed3f41474f0a2f624a440e5a106c2768edb67b (66/66)
    rm 'test_data/images/001.jpg'
    [...snip...]
    rm 'test_data/images/281.jpg'
    Ref 'refs/heads/master' was rewritten

    $ git log -p # looks nice

    $ rm -rf .git/refs/original/
    $ git reflog expire --all
    $ git gc --aggressive --prune
    Counting objects: 625, done.
    Delta compression using up to 2 threads.
    Compressing objects: 100% (598/598), done.
    Writing objects: 100% (625/625), done.
    Total 625 (delta 351), reused 0 (delta 0)

    $ du -hs .git
    174M    .git
    $ # still 175 MB :-(
Jonas H. asked Sep 26 '10 13:09



People also ask

Can I delete git objects pack?

What you are looking to do is called rewriting history, and it involves the git filter-branch command. This removes all references to the files from the active history of the repo. The next step is to run a GC cycle so that all remaining references to the file are expired and the objects are purged from the packfile.

How can I clear up a git Reflog?

git reflog expire --expire-unreachable=now --all removes all reflog entries for unreachable commits. git gc --prune=now then removes the commits themselves. Note: git gc --prune=now on its own will not work, because those commits are still referenced in the reflog. Clearing the reflog first is therefore mandatory.
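The ordering described above can be tried safely in a throwaway repository. The following is a minimal sketch, not a recipe for a real repository: the temporary directory, the file names, the `oops` branch and the demo identity are all hypothetical, and the stray commit is created on a deleted branch so that only the reflog still points at it.

```shell
#!/bin/sh
# Throwaway demonstration; every path and name here is hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

echo keep > keep.txt
git add keep.txt
git commit -q -m "keep"

# Simulate an unwanted commit on a side branch, then delete that branch.
git checkout -q -b oops
echo big > big.bin
git add big.bin
git commit -q -m "oops"
bad=$(git rev-parse HEAD)
git checkout -q -
git branch -q -D oops

# No branch points at the commit any more, but the HEAD reflog still does,
# so pruning alone is expected to leave it in place.
git gc -q --prune=now
if git cat-file -e "$bad" 2>/dev/null; then before=present; else before=gone; fi

# Expire the reflog entries first, then prune again.
git reflog expire --expire-unreachable=now --all
git gc -q --prune=now
if git cat-file -e "$bad" 2>/dev/null; then after=present; else after=gone; fi

echo "after plain gc: $before; after reflog expire + gc: $after"
```

`git cat-file -e` only tests whether the object exists, which makes it a convenient probe for whether the prune actually took effect.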


1 Answer

I answered this elsewhere, and will copy here since I'm proud of it!

... and without further ado, may I present to you this useful script, git-gc-all, guaranteed to remove all your git garbage until they might come up with extra config variables:

    git -c gc.reflogExpire=0 -c gc.reflogExpireUnreachable=0 \
      -c gc.rerereresolved=0 -c gc.rerereunresolved=0 \
      -c gc.pruneExpire=now gc "$@"

The --aggressive option might be helpful.

NOTE: this will remove ALL unreferenced thingies, so don't come crying to me if you decide later that you wanted to keep some of them!
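Given that warning, it may be worth looking at what is unreferenced before deleting anything. The sketch below is one way to do that, using `git fsck --unreachable --no-reflogs`, which only reports and does not delete. The function name `list_prunable` is hypothetical, and the list is an approximation: it shows objects that are unreachable once reflogs are ignored, which is roughly, not exactly, what the gc command above would drop.

```shell
# list_prunable: print "<type> <sha>" for every object that is unreachable
# when reflogs are ignored. Read-only; nothing is removed.
# The function name is hypothetical; the argument defaults to the current directory.
list_prunable() {
  # LC_ALL=C keeps the "unreachable ..." lines untranslated so sed can match them.
  LC_ALL=C git -C "${1:-.}" fsck --unreachable --no-reflogs 2>/dev/null |
    sed -n 's/^unreachable \([a-z]*\) \([0-9a-f]*\)$/\1 \2/p'
}
```

Running `list_prunable` inside the repository and then `git show <sha>` on anything that looks important gives a chance to rescue it, for example with `git branch rescue <sha>`, before running the cleanup.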

You might also need to run something like these first, oh dear, git is complicated!!

    git remote rm origin
    rm -rf .git/refs/original/ .git/refs/remotes/ .git/*_HEAD .git/logs/
    git for-each-ref --format="%(refname)" refs/original/ |
      xargs -n1 --no-run-if-empty git update-ref -d

I put all this in a script, here:

http://sam.nipl.net/b/git-gc-all-ferocious
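To check whether a cleanup like this actually shrank the object store, `git count-objects -v` reports the size of loose objects and packfiles in KiB, which is a narrower measurement than `du` on the whole .git directory. A minimal sketch follows; the helper name `object_store_kib` is hypothetical.

```shell
# object_store_kib: print the size of a repository's object store in KiB
# (loose objects plus packfiles), as reported by `git count-objects -v`.
# The function name is hypothetical; the argument defaults to the current directory.
object_store_kib() {
  git -C "${1:-.}" count-objects -v |
    # Sum the "size:" (loose) and "size-pack:" lines; skip "size-garbage:".
    awk '/^size(-pack)?:/ { total += $2 } END { print total + 0 }'
}
```

Calling `object_store_kib` once before and once after the gc run and comparing the two numbers shows whether anything was really removed.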

Sam Watkins answered Oct 26 '22 16:10
