
How can I find an unreachable commit hash in a Git repository by keywords?

Tags:

git

I'm a bit puzzled by a Git situation.

I'm working on a Git-versioned project and I just noticed that some commits that we thought were already on the master branch weeks ago are actually missing. I remembered these commits were pushed by someone else on a feature branch "feature/something", which no longer exists.

I tried to find those missing commits to fix our mistake and push them to a permanent branch. On this team, each developer puts the ID of the ticket they are working on in the commit message. So I know for sure that the ticket ID (e.g. 1234) is in the commit message I’m looking for, and I tried:

git log --all --grep=1234
git log -g --grep=1234
git log --all | grep 1234
git reflog | grep 1234

All of these commands returned nothing.

At this point, I was about to give up, and then I remembered our Git repo is integrated with Slack, so I searched for 1234 in the Slack history and found the commit hashes. I immediately tried:

git show hash1
git show hash2

which surprisingly worked! It displayed all the commit information. So the commits are there, somehow still in my local repository. So I wanted to double-check how I had missed them:

git reflog | grep hash1
git branch --contains hash1
git fsck --lost-found | grep hash1

Nothing.

git fsck --unreachable | grep hash1
unreachable commit hash1

And here it is, in the unreachable commits list.

But this is a big project and git fsck --unreachable returns tons of commits. How could I have found this lost commit by keyword? If we did not have a third-party tool logging the Git activity, maybe I would have tried piping the output of git fsck back into git show somehow and grepping the result, but that seems like a lot to do just to find a commit that I know is right here somewhere.

P.S.: I’m sorry I can’t share the repo, as it’s a private project, but the following should reproduce the situation:

User A:

git clone <repo>
git checkout -b feature/something
# add something to commit
git commit -m "special-keyword"
git push origin feature/something

User B:

git clone <repo>
git push origin :feature/something

Now User B works for weeks, and then tries to find the commit "special-keyword" pushed by User A.

Joucks asked Jul 13 '15 16:07


1 Answer

When you delete a branch, you also delete its reflog. There's a separate reflog for HEAD that will retain a reference to commits that were on the deleted branch, but only if you've had them checked out.

The difference between --lost-found and --unreachable is subtle [1]; see the git glossary and/or the illustration below. In general, using --lost-found and/or --unreachable will find such commit(s) (and --lost-found also writes their IDs into the .git/lost-found/commit directory, which I think has the side effect of protecting them from garbage collection).

In this particular case, the commit you were looking for was not the tip-most commit of the deleted branch. That is, suppose before deleting feature/something we have this, with the two most recent commits made on the feature branch:

A <- B <- C   <-- master
  \
    D <- E    <-- feature/something

Now we delete feature/something, losing the IDs of both commits D and E. Both IDs will show up in the output of git fsck --unreachable, but only E's ID will show up in (and be saved by) git fsck --lost-found, because commit D is "reachable" from E if/when you restore that commit.
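
To see the distinction concretely, here is a minimal sketch that builds a throwaway repository. It uses git commit-tree to create two chained commits (standing in for D and E) that no branch points to. The identity values, commit messages, and temporary directory are placeholders for illustration, not part of the original scenario. Plain git fsck reports dangling objects, which is the kind of object --lost-found saves.

```shell
# Placeholder identity so commits can be created in a clean environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway repository with a single commit on the current branch.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git commit -q --allow-empty -m "C"

# commit-tree writes commit objects without touching any branch or reflog,
# so D and E end up with no ref pointing at them.
D=$(git commit-tree -p HEAD -m "D" "HEAD^{tree}")
E=$(git commit-tree -p "$D" -m "E" "HEAD^{tree}")

# Both D and E are unreachable from every ref.
unreachable=$(git fsck --unreachable 2>/dev/null |
    awk '$1 == "unreachable" && $2 == "commit" { print $3 }' | sort)

# Only E is dangling: D is still referenced as E's parent.
dangling=$(git fsck 2>/dev/null |
    awk '$1 == "dangling" && $2 == "commit" { print $3 }')

echo "unreachable:"
echo "$unreachable"
echo "dangling:"
echo "$dangling"
```

The unreachable list should contain both IDs, while the dangling list should contain only the ID stored in E.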

Finding your commit

how could I have found this lost commit by keyword?

It's a bit tricky. Probably your best bet is using git show on all unreachable commits, something like:

git show $(git fsck --unreachable |
    awk '$2 == "commit" { print $3 }')

Now you can search for the keyword(s) in the log messages (or the diffs). The internal $(...) sequence is the method for extracting all the candidate IDs: we just want commits, not tags, trees, and blobs. Once you have the IDs, all regular git commands (git log -1, git show, etc) can be used with those.
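
If the full git show output is too long to page through, the same idea can be narrowed to a keyword. The following is only a sketch: find_unreachable_by_message is a hypothetical helper name, and it assumes it is run inside the repository and that the keyword appears in the commit message rather than in the diff.

```shell
# Hypothetical helper: print "<short-hash> <subject>" for every unreachable
# commit whose full message matches the given grep pattern.
find_unreachable_by_message() {
    pattern=$1
    git fsck --unreachable 2>/dev/null |
        awk '$1 == "unreachable" && $2 == "commit" { print $3 }' |
        while read -r id; do
            # %B is the raw commit message (subject and body).
            if git log -1 --format=%B "$id" | grep -q -e "$pattern"; then
                git log -1 --format='%h %s' "$id"
            fi
        done
}
```

For the ticket in the question, this would be called as find_unreachable_by_message 1234, and any hash it prints can then be passed to git show or git branch to recover the work.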


[1] In fact, I just learned it myself while writing up this answer.

torek answered Sep 20 '22 23:09