There are many SO questions about how to remove an accidentally added big file from a repo, and many of the answers suggest using the git gc command. However, it does not work for me and I don't know what is going wrong.
Here is what I have done:
$ git init
Initialized empty Git repository in /home/wzyboy/git/myrepo/.git/
$ echo hello >> README
$ git add README
$ git commit -a -m 'init commit'
[master (root-commit) f21783f] init commit
1 file changed, 1 insertion(+)
create mode 100644 README
$ du -sh .git
152K .git
$ cp ~/big.zip .
$ git add big.zip
$ git commit -a -m 'adding big file'
[master 3abd0a4] adding big file
1 file changed, 0 insertions(+), 0 deletions(-)
create mode 100644 big.zip
$ du -sh .git
77M .git
$ git log --oneline
3abd0a4 adding big file
f21783f init commit
$ git reset --hard f21783f
HEAD is now at f21783f init commit
$ git log --oneline
f21783f init commit
$ git gc --aggressive --prune=all
Counting objects: 6, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (4/4), done.
Writing objects: 100% (6/6), done.
Total 6 (delta 0), reused 0 (delta 0)
$ git gc --aggressive --prune=now
Counting objects: 6, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (4/4), done.
Writing objects: 100% (6/6), done.
Total 6 (delta 0), reused 6 (delta 0)
$ du -sh .git
77M .git
$ git version
git version 2.2.2
In the console output above, I created a new git repo and added one small text file; the .git directory is 152K in size, so far so good. Then I added a big file to the repo and the directory bloated to 77M. However, after attempting to remove the big file (with git reset --hard or git rebase -i), I cannot recover the disk space claimed by the big file, no matter how I run git gc with different options.
Could anyone tell me why git gc does not work in my case? What should I do to recover the disk space? Is it possible to recover the disk space using git gc instead of git filter-branch?
Thanks.
git prune deletes Git objects that are no longer reachable from any reference. git gc normally runs it for you, using the expiry set in its configuration (gc.pruneExpire).
--prune=now prunes loose objects regardless of their age and increases the risk of corruption if another process is writing to the repository concurrently; see the NOTES section of the git gc documentation. --prune is on by default.
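To see why pruning can leave an object in place, it helps to check whether the object is still reachable through the reflog. The following is a minimal sketch, assuming a throwaway repository created with mktemp; the file name and commit messages are made up for illustration. It compares git fsck --unreachable with and without --no-reflogs for a commit dropped by git reset --hard.

```shell
#!/bin/sh
set -eu

# Throwaway repository (hypothetical names, for illustration only)
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name  "Example"
git config user.email "example@example.invalid"

echo one > file
git add file
git commit -q -m 'first'
echo two > file
git commit -q -a -m 'second'
dropped=$(git rev-parse HEAD)

# Drop the second commit from the branch; the reflog still records it
git reset -q --hard HEAD~1

# Default view: reflog entries count as references
with_reflog=$(git fsck --unreachable 2>/dev/null | grep -c "$dropped" || true)

# Ignoring reflogs shows what becomes prunable once they expire
without_reflog=$(git fsck --unreachable --no-reflogs 2>/dev/null | grep -c "$dropped" || true)

echo "listed with reflogs considered: $with_reflog"
echo "listed with reflogs ignored:    $without_reflog"
```

If the dropped commit is listed only when reflogs are ignored, the reflog is what keeps it (and the files it added) alive.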
As Andrew C suggested, one needs to expire the reflog to dereference the objects before git gc is able to recycle them. So the correct way to recover the disk space claimed by accidentally added big files is:
git reflog expire --expire=now --all
git gc --aggressive --prune=now
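This removes every reflog entry in the repository, so commits dropped by earlier resets or rebases can no longer be recovered from the reflog. Use with caution.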
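The whole sequence can be reproduced in a scratch repository. This is only a sketch under stated assumptions: big.bin is a 2 MiB stand-in for big.zip generated from /dev/urandom, and the repository lives in a mktemp directory. It checks whether the big file's blob still exists after git gc alone, and again after expiring the reflog first.

```shell
#!/bin/sh
set -eu

# Scratch repository (hypothetical names, for illustration only)
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name  "Example"
git config user.email "example@example.invalid"

echo hello > README
git add README
git commit -q -m 'init commit'
first=$(git rev-parse HEAD)

# Stand-in for big.zip: 2 MiB of random, incompressible data
head -c 2097152 /dev/urandom > big.bin
git add big.bin
git commit -q -m 'adding big file'
blob=$(git rev-parse HEAD:big.bin)

# Remove the commit from the branch, as in the question
git reset -q --hard "$first"

# gc alone: the reflog still references the dropped commit
git gc --quiet --aggressive --prune=now
if git cat-file -e "$blob" 2>/dev/null; then before=kept; else before=gone; fi

# Expire the reflog first, then gc
git reflog expire --expire=now --all
git gc --quiet --aggressive --prune=now
if git cat-file -e "$blob" 2>/dev/null; then after=kept; else after=gone; fi

echo "blob after gc only:             $before"
echo "blob after reflog expire + gc:  $after"
du -sh .git
```

On a real repository, running du -sh .git before and after the two commands gives the same confirmation without needing the blob ID.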
One tip that helps catch a typo: since Git 2.18 (Q2 2018), git gc handles a --prune value that cannot be parsed as an expiry date (called "nonsense" here) more gracefully.
"git gc --prune=nonsense" spent long time repacking and then silently failed when underlying "git prune --expire=nonsense" failed to parse its command line.
This has been corrected.
See commit 96913c9 (23 Apr 2018) by Junio C Hamano (gitster).
Helped-by: Linus Torvalds (torvalds).
(Merged by Junio C Hamano -- gitster -- in commit 3915f9a, 08 May 2018)
parseopt: handle malformed --expire arguments more nicely

A few commands that parse --expire=<time> command line option behave sillily when given nonsense input.
For example

$ git prune --no-expire
Segmentation fault
$ git prune --expire=npw; echo $?
129

Both come from parse_opt_expiry_date_cb().

The former is because the function is not prepared to see arg==NULL (for "--no-expire", it is a norm; "--expire" at the end of the command line could be made to pass NULL, if it is told that the argument is optional, but we don't so we do not have to worry about that case).

The latter is because it does not check the value returned from the underlying parse_expiry_date().
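A script that runs git gc with a computed or hand-typed --prune value can guard against this by checking the exit status. The sketch below assumes Git 2.18 or later, where an unparsable value should make git gc fail; older versions may behave differently, as the commit message describes. The scratch repository and the status variable are made up for illustration.

```shell
#!/bin/sh

# Scratch repository (hypothetical, for illustration only)
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

# "nonsense" stands in for a mistyped expiry such as "npw" (meant: "now")
if git gc --quiet --prune=nonsense 2>/dev/null; then
    status=accepted
else
    status=rejected
fi

echo "git gc --prune=nonsense was $status"
```

Checking the status this way avoids assuming that a gc run pruned anything when the option was never parsed.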