I have a large binary file in a git repository, which has been changed in a few commits. These commits also included changes to other files. I would like to have only the most recent version of the binary file in the repository, but would like to keep the history of the other files that were changed in these commits.
All of the commits in question have already been pushed to GitHub, and pulled from there by other members of the team.
How can I do this?
EDIT: I don't believe this is a duplicate of the other referenced question. As noted in the comments below, I've looked at that question, but I want to remove every version of the file except the most recent one. This criterion is not addressed in the answers to the other question.
The simplest way is to use The BFG Repo-Cleaner, a faster, simpler alternative to git-filter-branch designed specifically for removing large files from Git repos.
You should follow the usage instructions carefully, but the main step is simply to download the Java jar (requires Java 7 or above) and run this command:
$ java -jar bfg.jar --strip-blobs-bigger-than 100M my-repo.git
Any blob over 100MB in size will be totally removed from your repository's history - unless it is the version present in the file tree of your latest commit, so your latest version will be untouched, as you required.
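Before rewriting anything, it can help to see which blobs in the history are actually above the size threshold. The following is a minimal sketch using plain git plumbing, not part of the BFG itself. The helper name `list_large_blobs` is hypothetical, and the threshold is given in bytes.

```shell
# Hypothetical helper (not a git or BFG command): list every blob reachable
# from any ref whose size exceeds the given number of bytes.
# Output lines have the form "<size> <sha> <path>", largest first.
list_large_blobs() {
    threshold_bytes="$1"
    # rev-list prints "<sha> <path>" for each object; cat-file looks up the
    # type and size and passes the path through as %(rest).
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(objectname) %(rest)' |
        awk -v min="$threshold_bytes" '$1 == "blob" && ($2 + 0) > (min + 0)' |
        cut -d' ' -f2- |
        sort -rn
}

# Example call from inside a repository: blobs larger than 100 MiB.
# list_large_blobs $((100 * 1024 * 1024))
```

Each version of the binary file appears as its own line, so several lines with the same path mean several old copies are still stored in the history.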
The BFG is also 10-50x faster than git-filter-branch.
Full disclosure: I'm the author of the BFG Repo-Cleaner.
Rather than trying to filter all but the latest version, just nuke the file from the history of your repo and re-add the most recent version:
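One way this could look is sketched below, using `git filter-branch` with an index filter. This is an illustration under stated assumptions, not necessarily the exact commands the answer intended. The function name `purge_and_readd` is hypothetical. It assumes you run it from the top level of a clean working tree and that the path contains no single quotes. Git's own documentation now recommends `git filter-repo` over `filter-branch`, so treat this as a sketch only.

```shell
# Hypothetical helper: remove one file from every commit on every ref,
# then commit the most recent copy of it again.
# Assumes: run from the repository root, clean working tree, and a path
# without single quotes.
purge_and_readd() {
    file_path="$1"
    backup="$(mktemp)" || return 1

    # Keep a copy of the latest version; the rewrite removes it from the
    # working tree as well as from history.
    cp "$file_path" "$backup" || return 1

    # Drop the file from the index of every commit. --ignore-unmatch keeps
    # commits that never contained the file from failing.
    FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --force \
        --index-filter "git rm -q --cached --ignore-unmatch -- '$file_path'" \
        -- --all || return 1

    # Restore the saved copy and commit it as the only version in history.
    mkdir -p "$(dirname "$file_path")" &&
        cp "$backup" "$file_path" &&
        git add -- "$file_path" &&
        git commit -q -m "Re-add latest version of $file_path" &&
        rm -f "$backup"
}

# Example call (the path is a placeholder):
# purge_and_readd path/to/large-file.bin
```

Because the commits have already been pushed, the rewritten history would have to be force-pushed, and teammates would need to re-clone or reset onto the new history rather than merge. The old blobs also stay on disk locally until the `refs/original` backups and reflog entries are gone and `git gc` has pruned them.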
Consider not tracking this file. Git isn't meant for large binary blobs.