
Is there a way to recover a commit that was accidentally skipped during a rebase?

Tags: git, git-rebase

When a useful commit is accidentally skipped during a rebase, is there any hope that Git keeps a reference to it so that it can be reapplied?

It was a non-interactive rebase with lots of binary files, and I got trigger-happy with git rebase --skip for far too long, so there were no error messages at all, just a lousy attitude.

This feels like a hard-disk crash recovery scenario, but instead of chasing phantom inodes, there should be a way to filter the lost tree objects inside .git/objects and bring them back to life.

asked Aug 08 '14 at 21:08 by milton


2 Answers

When you run git rebase (interactive or not), git basically does a series of cherry-pick operations to copy your original commit-chain to a new chain. Let's use o for the original commits, and draw the commit-graph fragment for branch branch coming off branch main:

        o1 - o2 - o3 - o4   <-- branch
      /
..- * - x                   <-- main

Now you might run git rebase to copy all the old o commits to new n commits, but based off x, the tip of main, rather than based off *, the old merge-base point. To make it even more like what happened, let's "accidentally" leave one out:

        o1 - o2 - o3 - o4   <-- ???
      /
..- * - x                   <-- main
          \
            n1 - n3 - n4    <-- branch

The ??? label above represents the git reference (branch-name, tag-name, or any other suitable label) that points or pointed to commit o4. All your old commits are still in there as long as there's a name pointing to them. If there's no name, they still stick around until git gc cleans them out (but you don't want that to happen so don't run git gc :-) ).
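
If you want to convince yourself of this before touching a real repository, the situation can be reproduced in a throwaway one. This is only a sketch under made-up assumptions: the file names, the commit() helper, and the forced conflict on f.txt (which is what gives us something to --skip) are all invented for the demo.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name demo && git config user.email demo@example.com

# hypothetical helper: write one file and commit it
commit() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$3"; }

commit f.txt base base            # the "*" commit
git checkout -q -b branch
commit o1.txt 1 o1
commit f.txt branch o2            # will conflict with x on main
commit o3.txt 3 o3
commit o4.txt 4 o4
git checkout -q main
commit f.txt main x
git checkout -q branch
old_tip=$(git rev-parse HEAD)     # remember o4 for the checks below

git rebase main >/dev/null 2>&1 || true   # stops on the o2 conflict
git rebase --skip >/dev/null 2>&1         # drops o2, copies o3 and o4

git cat-file -t "$old_tip"              # the old o4 is still a commit object
git rev-list --count main..branch       # length of the new n chain
git rev-list --count main.."$old_tip"   # length of the old o chain
```

The new chain comes out one commit shorter than the old one, and the old tip is still readable by its hash even though no branch points to it any more.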

The important question, then, is: "what name or names can we (and git) use to find o4?" It turns out there are at least two:

  • one or more in a "reflog", and
  • one spelled ORIG_HEAD.

The ORIG_HEAD one is the easiest to use, but that name is also used by other commands (git merge, for instance) so you have to see if it's still correct:

$ git log ORIG_HEAD

If that gives you the right chain, give yourself a more permanent name pointing to commit o4. This can be a branch name (you thus "resurrect" the old branch under a new name), or a tag name, or indeed any other name, but branch and tag names are the easy ones:

$ git branch zombie ORIG_HEAD

(You don't have to do this, and as you get more comfortable with git you can skip this step, but it's probably good to do until then.)
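
Here is a minimal end-to-end sketch of that check-then-name step in a scratch repository. Everything about the setup (file names, the commit() helper, the deliberate conflict on f.txt) is an assumption made for illustration; only the last two git commands are the ones from the text above.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name demo && git config user.email demo@example.com

# hypothetical helper: write one file and commit it
commit() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$3"; }

commit f.txt base base
git checkout -q -b branch
commit o1.txt 1 o1
commit f.txt branch o2            # will conflict with x on main
commit o3.txt 3 o3
commit o4.txt 4 o4
git checkout -q main
commit f.txt main x
git checkout -q branch
old_tip=$(git rev-parse HEAD)

git rebase main >/dev/null 2>&1 || true   # stops on o2
git rebase --skip >/dev/null 2>&1         # o2 is left out

git log --oneline ORIG_HEAD       # inspect: should list o4, o3, o2, o1, base
git branch zombie ORIG_HEAD       # give the old chain a lasting name
```

Running git log first matters on a real repository, because some later command may have overwritten ORIG_HEAD since the rebase.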


What if ORIG_HEAD has been whacked (e.g., by another rebase, or merge, or whatever)? Well, then there are reflogs.

There's one reflog for HEAD, and by default, another reflog for each branch-name. In this case the one to use would be the reflog for branch:

$ git reflog branch
$ git log -g branch

but you can just use git reflog to show the one for HEAD (this one is noisier, which is why looking at the one just for branch might be better):

$ git reflog
$ git log -g

Somewhere in all that output, you should be able to find commit o4. You might find lots of other commits that resemble o4, which is why git log -g can be helpful as it will let you find the real (or correct) o4.

In any case, assuming you eventually come up with a reflog style "relative name" (like branch@{1} or branch@{yesterday}), you can find the raw SHA-1, or use that relative name, to once again resurrect the zombie version of branch:

$ git branch zombie branch@{yesterday}

or:

$ git branch zombie feedd0gf00d

or whatever.
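
As a sketch of the reflog route, assuming the same kind of throwaway repository (invented file names, a hypothetical commit() helper, and a forced conflict so there is something to skip): in this toy case the pre-rebase tip happens to be branch@{1}, because the rebase wrote exactly one new entry into the reflog for branch. On a real repository you would read the reflog output to find the right entry rather than assume the number.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name demo && git config user.email demo@example.com

# hypothetical helper: write one file and commit it
commit() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$3"; }

commit f.txt base base
git checkout -q -b branch
commit o1.txt 1 o1
commit f.txt branch o2            # will conflict with x on main
commit o3.txt 3 o3
commit o4.txt 4 o4
git checkout -q main
commit f.txt main x
git checkout -q branch
old_tip=$(git rev-parse HEAD)

git rebase main >/dev/null 2>&1 || true
git rebase --skip >/dev/null 2>&1

git reflog show branch            # newest entry first; look for the rebase line
git branch zombie 'branch@{1}'    # the entry just before the rebase finished
```

The quotes around branch@{1} are only there to keep the shell from touching the braces.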


All this does is give you a name, zombie, where there were three question-marks in the drawing of the graph. You still have to use that to find the dropped commit, in this case commit o2. You can find it by raw SHA-1 (by reading through git log) and re-rebase and pull that one in, or cherry-pick it to append a copy to n4, or whatever.
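
If reading through git log by eye is tedious, git cherry can do the comparison for you: it marks with "+" the commits on one side that have no equivalent patch on the other. The sketch below is an illustration, not the one true method. It fakes the botched rebase with plain cherry-picks (so that o2 re-applies cleanly afterwards), and the file names and commit() helper are invented.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name demo && git config user.email demo@example.com

# hypothetical helper: write one file and commit it
commit() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$3"; }

commit base.txt 0 base
git checkout -q -b zombie                 # plays the rescued old chain
for i in 1 2 3 4; do commit "o$i.txt" "$i" "o$i"; done
git checkout -q main
commit x.txt x x

# stand-in for the rebase: copy o1, o3, o4 onto main and leave o2 out
git checkout -q -b branch main
git cherry-pick zombie~3 zombie~1 zombie >/dev/null

# "+" lines are commits on zombie with no matching patch on branch
dropped=$(git cherry branch zombie | sed -n 's/^+ //p')
git cherry-pick "$dropped" >/dev/null     # append a copy of o2 after n4
git log --oneline main..branch
```

If the commit was skipped because of a conflict, as in the question, the final cherry-pick will stop on that same conflict and you will have to resolve it by hand this time.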

If all you want to do is set branch back to commit o4, you can even dispense with the zombie branch entirely, and just do a git reset --hard while on branch branch:

$ git checkout branch           # if needed
$ git reset --hard feedd0gf00d

or:

$ git reset --hard ORIG_HEAD

Note that the thing after reset --hard is just any commit-ID. The --hard makes reset wipe out your work-tree and replace it with the target commit, while the reset action itself tells git: "make the current branch point to the commit-ID I'm about to give you, regardless of whatever branch-tip-commit it names right now."

In other words, after your git rebase finishes and you discover you left out o2 when making the n1 - n3 - n4 chain, if you immediately git reset --hard ORIG_HEAD, git changes this:[1]

        o1 - o2 - o3 - o4   <-- ORIG_HEAD
      /
..- * - x                   <-- main
          \
            n1 - n3 - n4    <-- HEAD=branch

to this:

        o1 - o2 - o3 - o4   <-- ORIG_HEAD, HEAD=branch
      /
..- * - x                   <-- main
          \
            n1 - n3 - n4    [abandoned]

The [abandoned] chain of n commits is actually still in the repo, of course: there's a name pointing to n4 in the reflogs!
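
Both halves of that picture can be checked in a scratch repository. As before, this is a sketch with an invented setup (file names, commit() helper, forced conflict on f.txt); the point is only the last few commands.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name demo && git config user.email demo@example.com

# hypothetical helper: write one file and commit it
commit() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$3"; }

commit f.txt base base
git checkout -q -b branch
commit o1.txt 1 o1
commit f.txt branch o2            # will conflict with x on main
commit o3.txt 3 o3
commit o4.txt 4 o4
git checkout -q main
commit f.txt main x
git checkout -q branch
old_tip=$(git rev-parse HEAD)     # o4

git rebase main >/dev/null 2>&1 || true
git rebase --skip >/dev/null 2>&1
new_tip=$(git rev-parse HEAD)     # n4

git reset -q --hard ORIG_HEAD     # branch goes back to o4
git rev-parse branch              # same hash as old_tip
git rev-parse 'branch@{1}'        # n4, still findable through the reflog
```

So the reset is itself undoable in the same way the rebase was: the abandoned n4 is one reflog entry back.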

(The reflog entries eventually expire: by default after 90 days, or after 30 days for entries that are no longer reachable from the branch's current tip. Once they expire and there is no name by which to find n4 or o4 or whatever, git gc will clean up and remove them.)
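
If those windows feel too short while you are still learning, they can be widened per repository through the gc.reflogExpire and gc.reflogExpireUnreachable settings. The values below are arbitrary examples, not recommendations, and the scratch repository is only there so the sketch runs on its own.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q .

# example values only: keep reflog entries around longer than the defaults
git config gc.reflogExpire "180 days"
git config gc.reflogExpireUnreachable "90 days"

git config --get gc.reflogExpire
git config --get gc.reflogExpireUnreachable
```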


[1] Note that I've added the HEAD= notation to this graph, to indicate which branch you're on. This HEAD= stuff is actually a pretty good approximation to how git keeps track of which branch you're on. In the .git directory, there's a file named HEAD, and that file simply contains the name of the current branch![2] If you write a new name in the file, git changes its idea of which branch you're on (without changing anything else). git reset works one level down: it leaves HEAD naming the same branch and instead makes that branch name point to a new commit-ID. That is all git reset --soft does. (Using --mixed adds a little more action: git reset then updates the index/staging-area to match that commit; and using --hard adds even more: git reset then wipes out work-directory contents, replacing them with the contents of the commit the branch now points to.)

[2] In "detached HEAD" mode, the file contains the raw SHA-1 of the current commit, instead of the name of the current branch. That, in fact, is the real difference between being "on a branch" and being in "detached HEAD" mode. When git wants to know what the current commit is, it looks at the file HEAD. If it has a raw SHA-1, that's the answer. If it has a branch name, git reads the branch-name to get the raw SHA-1. Those are the only two allowed setups; nothing else should be in the HEAD file.
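
You can look at the file yourself to see both setups. This sketch assumes a fresh scratch repository using git's ordinary on-disk ref storage; the file a.txt and the commit message are placeholders. Note that the branch name is stored in full, with a "ref: " prefix.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name demo && git config user.email demo@example.com
echo hi > a.txt && git add a.txt && git commit -q -m first

on_branch=$(cat .git/HEAD)        # symbolic: names the current branch
echo "$on_branch"

git checkout -q --detach          # same commit, but no longer on a branch
detached=$(cat .git/HEAD)         # now a raw commit hash
echo "$detached"
```

Reading the file is fine for curiosity; for scripts, git symbolic-ref HEAD and git rev-parse HEAD are the supported ways to ask the same two questions.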

answered Oct 17 '22 at 08:10 by torek


Does git reflog work for you? I think the commit should still be in the repository unless you ran git gc.

answered Oct 17 '22 at 07:10 by AgileDan