
How to find the git object id of an object with a known hash

I am using bfg to clean my git repo. To get the list of big files to delete, I use this script. However, for some files I only want to delete specific versions from the repo, not every version.

bfg has the option to "strip blobs with the specified Git object ids". When I run the above script, I am given a hash for each object in the list. Given that hash, how can I find out the git object id of that specific object so that I can delete it with bfg?

Chin asked Jul 01 '17 05:07


1 Answer

That script already appears to list the git object ids: the hash it prints for each object is the object id.

If there is a particular commit you want to clean, you can use the command line from "Which commit has this blob?" to check whether a given object id is part of that commit:

git log --all --pretty=format:%H -- <path> | \
 xargs -n1 -I% sh -c "git ls-tree % <path> | \
 grep -q <hash> && echo %"

For instance, in my repo seec:

a255b5c1d469591037e4eacd0d7f4599febf2574 12884 seec.go
a7320d8c0c3c38d1a40c63a873765e31504947ff 12928 seec.go

Suppose I want to clean the a7320d8 version of seec.go.

As seen in BFG commit 12d1b00:

People can get a list of blob-ids using "git rev-list --all --objects", then grep to list all files in directories they want to nuke, and pass that to the BFG.
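
As a minimal sketch of that quoted suggestion, the helper below filters the output of "git rev-list --all --objects" by path. The function name list_blob_ids is hypothetical, and the sketch assumes an exact, repo-relative path without spaces:

```shell
# Print the id of every object recorded under the given path, across all refs.
# "git rev-list --all --objects" prints "<id> <path>" for trees and blobs;
# awk keeps the lines whose path equals the argument and prints the id only.
# Hypothetical helper: assumes the path contains no spaces.
list_blob_ids() {
  git rev-list --all --objects | awk -v p="$1" '$2 == p { print $1 }'
}

# Usage, inside the repository:
#   list_blob_ids seec.go
```

Each printed line is one version of the file, so you can pick only the versions you want to strip.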

Note: the test for the -bi option reads:

val blobIdsFile = Path.createTempFile()
blobIdsFile.writeStrings(badBlobs.map(_.name()),"\n")
run(s"--strip-blobs-with-ids ${blobIdsFile.path}")

This means the parameter to -bi is a file containing the blob id(s).
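
A small sketch of preparing such a file, one id per line as in the test above. The helper name write_blob_ids_file, the file name blob-ids.txt, the jar path, and the repo name are all placeholders rather than anything from BFG itself; using the full 40-character id is the safe assumption:

```shell
# Write each blob id given as an argument on its own line of the output file.
# Hypothetical helper: the first argument is the file to create.
write_blob_ids_file() {
  out_file="$1"
  shift
  printf '%s\n' "$@" > "$out_file"
}

# Usage (file, jar and repo names are placeholders):
#   write_blob_ids_file blob-ids.txt a7320d8c0c3c38d1a40c63a873765e31504947ff
#   java -jar bfg.jar --strip-blobs-with-ids blob-ids.txt my-repo.git
```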


I can also check that what I just got is indeed a blob id by looking for the commit that contains it:

vonc@bvonc MINGW64 ~/data/git/seec (master)
$ git log --all --pretty=format:%H -- seec.go | xargs -n1 -I% sh -c "git ls-tree % seec.go|\
grep -q a7320d8 && echo %"

I get: commit c084402.

Let's check that this commit actually includes the seec.go blob a7320d8 (using "Git - finding the SHA1 of an individual file in the index").
I can compute the blob id of the file as it appears in that commit on GitHub:

vonc@bvonc MINGW64 ~/data/git/seec (master)
$ (echo -ne "blob $(curl -s https://raw.githubusercontent.com/VonC/seec/c084402/seec.go --stderr -|wc -c)\0"; \
   curl -s https://raw.githubusercontent.com/VonC/seec/c084402/seec.go --stderr -) | \
  sha1sum | awk '{ print $1 }'
a7320d8c0c3c38d1a40c63a873765e31504947ff

Bingo.
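
The same computation can be sketched for a local file. This assumes a SHA-1 repository and that sha1sum is installed; the function name manual_blob_id is hypothetical:

```shell
# Compute a blob id by hand: SHA-1 over the header "blob <size>\0"
# followed by the raw file content.
manual_blob_id() {
  file="$1"
  size=$(wc -c < "$file" | tr -d ' ')
  { printf 'blob %s\0' "$size"; cat "$file"; } | sha1sum | awk '{ print $1 }'
}

# Cross-checks, inside the repository:
#   manual_blob_id seec.go          # id computed by hand
#   git hash-object seec.go         # id computed by git for the same file
#   git rev-parse c084402:seec.go   # id recorded for seec.go in that commit
```

If the file on disk is the version from that commit, all three should print the same id.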

So if I want to strip out the seec.go blob a7320d8, I can pass that blob id to bfg (in a "blob ids" file).

VonC answered Oct 11 '22 13:10