I am trying to reduce the size of a largish repo (~3.4 GB), and bfg-repo-cleaner seemed like the perfect tool for the job.
I ran the tool as described in the docs but am only seeing minor reductions in the size of the repo. What is particularly surprising is that some (but not all) of the blobs that the tool says it removed (listed in deleted-files.txt) are still very much in the repository. I really don't want to start messing with git filter-branch, so any help would be appreciated.
I intentionally went with the aggressive --no-blob-protection option to maximize the effect. I've included the commands I ran with the truncated output.
git count-objects -vH
count: 0
size: 0 bytes
in-pack: 1616184
packs: 1
size-pack: 3.38 GiB
prune-packable: 0
garbage: 0
size-garbage: 0 bytes
du -rh -d 0
3.4G .
java -jar ~/Downloads/bfg-1.12.12.jar --strip-blobs-bigger-than 2M --no-blob-protection ./
Scanning packfile for large blobs: 1616184
Scanning packfile for large blobs completed in 33,465 ms.
Found 242 blob ids for large blobs - biggest=497179278 smallest=2098032
Total size (unpacked)=3534794122
Found 0 objects to protect
Found 4965 tag-pointing refs : ...
Found 8519 commit-pointing refs : ...
Protected commits
You're not protecting any commits, which means the BFG will modify the contents of even *current* commits.
This isn't recommended - ideally, if your current commits are dirty, you should fix up your working copy and commit that, check that your build still works, and only then run the BFG to clean up your history.
Found 110364 commits
Cleaning commits: 100% (110364/110364)
Cleaning commits completed in 345,977 ms.
Updating 13483 Refs
Ref Before After
Updating references: 100% (13483/13483)
...Ref update completed in 15,354 ms.
Commit Tree-Dirt History
Earliest Latest
| |
D = dirty commits (file tree fixed)
m = modified commits (commit message or parents changed)
. = clean commits (no changes to file tree)
Before After
First modified commit | 757f8383 | c11fc923
Last dirty commit | e28d047b | 92b88b05
Deleted files
In total, 418853 object ids were changed. Full details are logged here:
git count-objects -vH
count: 419093
size: 1.62 GiB
in-pack: 1616184
packs: 1
size-pack: 3.38 GiB
prune-packable: 0
garbage: 0
size-garbage: 0 bytes
du -rh -d 0
5.1G .
git reflog expire --expire=now --all && git gc --prune=now --aggressive
Counting objects: 1905870, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (1786570/1786570), done.
Writing objects: 100% (1905870/1905870), done.
Total 1905870 (delta 1274991), reused 482300 (delta 0)
Removing duplicate objects: 100% (256/256), done.
Checking connectivity: 1905870, done.
git count-objects -vH
count: 0
size: 0 bytes
in-pack: 1905870
packs: 1
size-pack: 3.03 GiB
prune-packable: 0
garbage: 0
size-garbage: 0 bytes
head ..bfg-report/2016-04-18/10-24-49/deleted-files.txt
8afa72875d3013620bb122916bd1ec33a066cbf2 1075353 file_name1.gpx
7656f6464c67f92c48cdbb03ec5a81067c636238 1644202 file_name2.csv
ab68fb197d4479b3b6dec6e85bd5cbaf433a87c5 773236 file_name3.ttf
86c9c0b55ff99c3789bb3ed17daf51bebacba1cb 870631 file_name4@2x.png
70c928943feab0a3a1f97b4f752e9dbc1d8f37fa 950305 file_name5@2x.png
3862d0da43f5902c75e86ff0dd925d8cca601de3 779356 file_name6@2x.png
6effce4b245961cb46e2cf3f4d05bd6c8c182760 908017 file_name7@2x.png
1866b1053dd48fc4d0677f03feb4baf2f67b567c 1353732 file_name8.gif
f0d984f00678504fe073110bb6553049e9678755 1350785 file_name9.gif
af877d286b12b9f79560a938375abe04a15ff405 3214192 file_name10.gif
git cat-file -s 8afa72875d3013620bb122916bd1ec33a066cbf2
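Checking ids one at a time with git cat-file gets tedious, so the same check can be run over the whole report. This is a minimal sketch, assuming the report has the "id size name" layout shown above; the function name surviving_blobs is hypothetical. Note that git cat-file -e succeeds for any object still in the object database, reachable or not, so the result is only meaningful after the reflog expire and gc step.

```shell
# Print the entries of a BFG deleted-files.txt report whose blob ids
# still exist in the object database of the current repository.
# surviving_blobs is a hypothetical helper name, not part of the BFG.
surviving_blobs() {
  report="$1"
  # Each report line is assumed to be: <blob id> <size> <file name>
  while read -r id size name; do
    # cat-file -e exits 0 only if the object exists
    if git cat-file -e "$id" 2>/dev/null; then
      echo "$id $name"
    fi
  done < "$report"
}
```

It would be called from inside the repository with the path to the report, for example surviving_blobs path/to/deleted-files.txt.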
I've figured out the problem. We had a lot of old branches that still pointed to trees containing large blobs. Deleting these and rerunning the BFG gave me a multi-gigabyte reduction.
I had thought that the --no-blob-protection flag would have handled this case.
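To find which refs are still holding on to a blob that the report lists as deleted, each ref's history can be searched for that blob id. This is a rough sketch, not something the BFG provides; the function name refs_reaching_blob is hypothetical, and walking every ref separately will be slow on a repository with thousands of refs.

```shell
# Print every ref whose history still contains the given blob id.
# refs_reaching_blob is a hypothetical helper name.
refs_reaching_blob() {
  blob="$1"
  git for-each-ref --format='%(refname)' | while read -r ref; do
    # rev-list --objects lists "<id> <path>" for each reachable object
    if git rev-list --objects "$ref" | grep -q "^$blob"; then
      echo "$ref"
    fi
  done
}
```

For example, refs_reaching_blob 8afa72875d3013620bb122916bd1ec33a066cbf2 would list the branches and tags that still reach the first blob in the report above.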
I found that rerunning the BFG several times with the same arguments kept turning up more commits to clean. Eventually it said
BFG aborting: No refs to update - no dirty commits found??
At that point, git reflog expire and git gc reduced the pack size.
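The repeated runs can be scripted. This is only a sketch: it assumes the abort message quoted above appears in the BFG's output, it does not rely on any particular exit code, and the function name rerun_until_no_dirty and the max_runs cap are hypothetical.

```shell
# Rerun a cleaning command until its output contains "BFG aborting",
# or give up after max_runs passes (a hypothetical safety cap).
rerun_until_no_dirty() {
  max_runs="$1"; shift
  run=1
  while [ "$run" -le "$max_runs" ]; do
    echo "pass $run"
    out=$("$@" 2>&1)
    printf '%s\n' "$out"
    case "$out" in
      *"BFG aborting"*) return 0 ;;
    esac
    run=$((run + 1))
  done
  return 1
}

# Example call, using the command line from the question:
# rerun_until_no_dirty 10 java -jar ~/Downloads/bfg-1.12.12.jar \
#   --strip-blobs-bigger-than 2M --no-blob-protection ./
# git reflog expire --expire=now --all && git gc --prune=now --aggressive
```

The function returns 0 once the abort message is seen and 1 if the cap is reached first, so the reflog expire and gc step can be chained after it with &&.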