I am using bfg to clean my git repo. To get the list of big files to delete, I use this script. However, for some files, I only want to delete specific versions of them from the repo.
bfg has the option to "strip blobs with the specified Git object ids". When I run the above script, I am given a hash for each object in the list. Given that hash, how can I find out the git object id of that specific object so that I can delete it with bfg?
That script appears to list the git object ids already.
If there is a particular commit you want to clean, you can use the command from "Which commit has this blob?" to check whether a given object id is part of that commit:
git log --all --pretty=format:%H -- <path> | \
xargs -n1 -I% sh -c "git ls-tree % <path> | \
grep -q <hash> && echo %"
For instance, in my repo seec:
a255b5c1d469591037e4eacd0d7f4599febf2574 12884 seec.go
a7320d8c0c3c38d1a40c63a873765e31504947ff 12928 seec.go
I want to clean the a7320d8 version of seec.go.
As seen in BFG commit 12d1b00:
People can get a list of blob-ids using "git rev-list --all --objects", then grep to list all files in directories they want to nuke, and pass that to the BFG.
Note: the -bi test reads:
val blobIdsFile = Path.createTempFile()
blobIdsFile.writeStrings(badBlobs.map(_.name()),"\n")
run(s"--strip-blobs-with-ids ${blobIdsFile.path}")
This means the parameter to -bi (--strip-blobs-with-ids) is a file containing the blob id(s), one per line.
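As a rough illustration of that workflow, here is a minimal sketch that filters the output of git rev-list --all --objects down to the blob ids recorded for one path, so the wanted id can be written to a file. The helper name list_blob_ids, the file name blob-ids.txt, the bfg.jar location, and the seec.git repository name are my own placeholders, not anything BFG prescribes.

```shell
#!/bin/sh
# Hypothetical helper: print every blob id that "git rev-list --all --objects"
# reports for the exact path given as $1. Run it inside the repository.
list_blob_ids() {
    git rev-list --all --objects |
        # Each object line is "<id> <path>"; commit lines carry only an id.
        awk -v p="$1" '{ id = $1; sub(/^[^ ]+ /, ""); if ($0 == p) print id }'
}

# Usage sketch (not executed here; file and jar names are placeholders):
#   list_blob_ids seec.go | grep '^a7320d8' > blob-ids.txt
#   java -jar bfg.jar --strip-blobs-with-ids blob-ids.txt seec.git
```

Keeping only the line that starts with the wanted id is what restricts the cleanup to a single version of the file rather than all of its history.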
I can also check that what I have is indeed a blob id by looking for the commit that contains it:
vonc@bvonc MINGW64 ~/data/git/seec (master)
$ git log --all --pretty=format:%H -- seec.go | xargs -n1 -I% sh -c "git ls-tree % seec.go|\
grep -q a7320d8 && echo %"
I get: commit c084402.
Let's see if that commit actually includes the seec.go blob id a7320d8 (using "Git - finding the SHA1 of an individual file in the index").
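If the repository is available locally, the same check can probably be done without leaving git, since git rev-parse resolves a "commit:path" expression to the blob id stored there. The sketch below assumes that approach; blob_id_at and commit_has_blob are hypothetical helper names of mine.

```shell
#!/bin/sh
# Hypothetical helper: print the blob id stored for path $2 in commit $1.
blob_id_at() {
    git rev-parse --verify --quiet "$1:$2"
}

# Succeed only if commit $1 stores path $2 with a blob id starting with $3.
commit_has_blob() {
    [ -n "$3" ] || return 1          # an empty prefix would match anything
    case "$(blob_id_at "$1" "$2")" in
        "$3"*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Usage sketch: commit_has_blob c084402 seec.go a7320d8 && echo "match"
```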
I can find the blob id of a file from a GitHub commit:
vonc@bvonc MINGW64 ~/data/git/seec (master)
$ (echo -ne "blob $(curl -s https://raw.githubusercontent.com/VonC/seec/c084402/seec.go --stderr -|wc -c)\0"; \
curl -s https://raw.githubusercontent.com/VonC/seec/c084402/seec.go --stderr -) | \
sha1sum | awk '{ print $1 }'
a7320d8c0c3c38d1a40c63a873765e31504947ff
Bingo.
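The pipeline above works because a blob id is the SHA-1 of the header "blob <size>" followed by a NUL byte and then the file content. A minimal sketch of the same computation on a local file, assuming sha1sum is installed (the function name blob_id is mine); git hash-object <file> should print the same value.

```shell
#!/bin/sh
# Hypothetical helper: compute the git blob id of the file $1 by hand.
blob_id() {
    # Byte count of the file, with any padding from wc stripped.
    size=$(wc -c < "$1" | tr -d '[:space:]')
    # Hash the header "blob <size>\0" immediately followed by the raw content.
    { printf 'blob %s\0' "$size"; cat "$1"; } | sha1sum | awk '{ print $1 }'
}

# Usage sketch: blob_id seec.go
# Cross-check with git itself: git hash-object seec.go
```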
Should I want to strip out the seec.go blob a7320d8, I know I can pass that blob id to bfg (in a "blob ids" file).