 

How to remove big files from old commits in bitbucket

My Bitbucket repo got very big (1.6 GB) after I made some awful commits a few months ago. I did not realize how serious the situation was (noob...) until a colleague tried to clone it and failed (too big).

I carefully read the post "Why is my git repository so big?" and did the following (as @Vi suggested):

  • Detect fat files in my repo history

    git rev-list --all --objects | \
    sed -n $(git rev-list --objects --all | \
    cut -f1 -d' ' | \
    git cat-file --batch-check | \
    grep blob | \
    sort -n -k 3 | \
    tail -n40 | \
    while read hash type size; do 
     echo -n "-e s/$hash/$size/p ";
    done) |
    sort -n -k1
    

    Let's say one of the fat files is mybigfile.gz

  • Delete mybigfile.gz from repo

    git filter-branch -f  --index-filter \
    'git rm --force --cached --ignore-unmatch mybigfile.gz' \
    -- --all
    rm -Rf .git/refs/original && \
    git reflog expire --expire=now --all && \
    git gc --aggressive && \
    git prune
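
The two steps above can be tried safely on a throwaway repository first. The following is only a sketch under assumptions: the `largest` helper, the temp directory and the file sizes are made up for illustration, and it lists blobs with a custom `git cat-file --batch-check` format instead of the `sed` pipeline above. Paths containing spaces would be cut off by the `awk` step.

```shell
#!/bin/sh
# Sketch: list the largest blobs in history, rewrite, then confirm the
# path is gone. Everything happens in a scratch repository.
set -eu

work=$(mktemp -d)                 # hypothetical scratch location
cd "$work"
git init -q demo && cd demo
git config user.email demo@example.com
git config user.name demo

# Commit 1 adds a "fat" file; commit 2 deletes it, but history keeps it.
head -c 200000 /dev/urandom > mybigfile.gz
echo hello > small.txt
git add . && git commit -q -m "add files"
git rm -q mybigfile.gz && git commit -q -m "remove big file"

# Print "size path" for blobs reachable from any ref, biggest last.
largest() {
    git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    awk '$1 == "blob" { print $2, $3 }' |
    sort -n | tail -n 40
}
before=$(largest)
echo "$before"

# Same rewrite as in the step above (warning delay squelched).
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f --index-filter \
    'git rm --force --cached --ignore-unmatch mybigfile.gz' \
    -- --all >/dev/null 2>&1

# Drop the backup refs that filter-branch leaves under refs/original.
git for-each-ref --format='delete %(refname)' refs/original |
    git update-ref --stdin

# Commits on any ref that still touch the path; empty means none.
remaining=$(git log --all --oneline -- mybigfile.gz)
echo "commits still touching mybigfile.gz: ${remaining:-none}"
```

If `remaining` is empty, no ref points at a commit containing the file any more. The blob itself stays on disk until the reflog is expired and `git gc` has pruned it, which is what the remaining commands in the step above do.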
    

This worked locally: my local repo directory is now 850 MB. The problem is that the remote repository did not change size. Then I tried

git push origin --force --all

but the situation got worse: now my remote repo is 2 GB! How can I fix this awful situation? Do you suggest creating a new repo, or is there something else I can do to sort it out?

Thank you.

EDIT: Let me try to state the problem better. Some months ago, I committed some big files to my repo, several times. When I realised it, I added these files to .gitignore and kept committing to the repo without them. I did not pay attention to the Bitbucket warning (your repo is too big). Now I need to get rid of these files stored in old commits, both locally and remotely. I successfully cleaned up my local git directory with the procedure described above. My problem is that when I push to the remote master branch, the remote repo is not affected by the local cleanup.

EDIT 2: I tried BFG Repo-Cleaner on my local .git directory

java -jar bfg-1.12.3.jar --strip-blobs-bigger-than 100M

here the output.

According to this tutorial, this should be enough to remove the blobs from the remote repo, but that did not happen. Locally my repo is slim, but the remote is still huge. I think I'm missing just one step, but I do not know which. Do you think it is easier to just create a new repo?

user123892 asked Oct 08 '15 10:10




1 Answer

From the comments I understood that the problem is fixed locally, but not on the remote. Let's do some mad science to force all objects to be dereferenced and garbage collected with the following commands (create a backup first):

git reflog expire --expire=now --all
git gc --prune=now --aggressive
git push -f

Maybe this will clean up the remote repository.
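
To see why the push alone may not be enough, here is a small sketch that uses a local bare repository as a stand-in for the hosted remote (an assumption: Bitbucket's internals are not visible from outside, and the names `remote.git`, `main` and `has_blob` are made up for the demo). It shows that a force push only moves the branch, while the old objects stay stored on the receiving side until garbage collection runs there.

```shell
#!/bin/sh
# Sketch: force-pushing a cleaned history does not delete old objects
# on the receiving repository; only a gc on that side does.
set -eu

work=$(mktemp -d)                        # hypothetical scratch location
cd "$work"
git init -q --bare remote.git            # stand-in for the hosted remote
git init -q local && cd local
git config user.email demo@example.com
git config user.name demo
git remote add origin "$work/remote.git"

# Push a history that contains a large blob.
head -c 200000 /dev/urandom > mybigfile.gz
echo hello > small.txt
git add . && git commit -q -m "add files"
big=$(git rev-parse HEAD:mybigfile.gz)   # object id of the large blob
git push -q origin HEAD:refs/heads/main

# Rewrite the commit without the big file and force-push it.
git rm -q --cached mybigfile.gz
git commit -q --amend -m "clean history"
git push -q -f origin HEAD:refs/heads/main

# Succeeds if the remote still stores the blob, reachable or not.
has_blob() {
    git --git-dir="$work/remote.git" cat-file -e "$big" 2>/dev/null
}

if has_blob; then after_push=present; else after_push=gone; fi

# Garbage collection on the remote side is what actually drops it.
git --git-dir="$work/remote.git" reflog expire --expire=now --all
git --git-dir="$work/remote.git" gc --quiet --prune=now
if has_blob; then after_gc=present; else after_gc=gone; fi

echo "after force push: $after_push, after remote gc: $after_gc"
```

On a hosted service you usually cannot run that `gc` yourself, so the size shown there may only drop once the host collects garbage. If it never does, pushing the cleaned history to a fresh, empty repository is the fallback the question already mentions.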

Ionică Bizău answered Nov 01 '22 23:11