
Is there a way to refer to a child commit of the current detached HEAD?

I know how to refer to a parent commit as HEAD^.

But is it possible to refer to a child commit in a similar way?

konstunn asked Aug 27 '16 15:08



1 Answer

The answer is both no and yes, or perhaps "this question makes no sense", depending on what you mean.

Your question refers specifically to "a child commit of the current detached HEAD". This is why the question makes no sense, at least not without additional information.

As Tim Biegeleisen noted in a comment, when you check out by branch name, this puts you on the tip of the branch. This occurs by definition in Git, because Git is different from most version control systems. In Git, the term "branch" is somewhat ambiguous: we have branch names, which are defined as "labels pointing to a commit that denote that branch's tip", and we have the branch-and-merge structures (what I like to call "DAGlets") within the commit DAG. See the question "What exactly do we mean by 'branch'?" for more about this. However, a key consequence of all this is that commits are often on more than one branch at a time.

The missing information

When you are pointing to some commit somewhere in the commit DAG, and you ask about child commits, you are making an assumption. Specifically, you are assuming a particular branch or set of branches. To see how this works, let's look at how branches evolve.

Branches in motion

The tip of a branch is, by definition, its end: there are no later commits on that branch. If you run git commit and make a new one, the branch name automatically points to the new commit you just made. The branch now has a new tip-most commit, and (here's the real kicker) you're now at that commit, which of course has no children: it's a new childless commit. It can only acquire children when new commits are made on top of it.

This makes sense visually: here we are at the tip of some branch, commit D in branch master:

                        HEAD
                         |
                         v
A <- B <- C <- D   <-- master

We are on branch master (HEAD points to master) and master points to commit D.

The act of adding a new commit—doing a successful git commit—means we create a new commit E, whose parent is D, and Git immediately makes master point to E as well:

                             HEAD
                              |
                              v
A <- B <- C <- D <- E   <-- master

The internal branch arrows within these graphs always point left-ward (backwards in time) so I'm going to start drawing the commits without internal arrows here. Let's do that now for this last graph:

A--B--C--D--E   <-- (HEAD) master

Now, all this stuff about children remains true even if we decide we need to make a new branch that grows from commit D. Suppose at this point we do:

git checkout -b feature master~1

What this does is to make a new branch label, feature, pointing directly to commit D. Commit E—the tip of master—is still there, but now HEAD will point to feature and feature will point to D:

           E   <-- master
          /
A--B--C--D     <-- (HEAD) feature

Commits A through D are now on both branches: both master and feature.

If we make a new commit now on feature, we get this:

           E    <-- master
          /
A--B--C--D--F   <-- (HEAD) feature

That is, F is now the tip-most commit of feature: HEAD has not changed at all, and feature now points to F. Commits E and F are on just one branch each, and commits A through D are still on both branches.
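To check which branch names contain which commits, a throwaway repository helps. The following is a minimal sketch, assuming a POSIX shell and a Git recent enough to support git branch --format; the temporary directory, the "demo" identity, and the empty commits named A through F are illustrative assumptions:

```shell
# Build the A..E history in a scratch repository (all names here are illustrative).
repo=$(mktemp -d) && cd "$repo"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q
git symbolic-ref HEAD refs/heads/master   # force the initial branch name to be master
for m in A B C D E; do git commit -q --allow-empty -m "$m"; done

git checkout -q -b feature master~1       # feature points to D
git commit -q --allow-empty -m F          # feature now points to F

# List the branch names whose history contains a given commit.
on_d=$(git branch --format='%(refname:short)' --contains master~1)
on_e=$(git branch --format='%(refname:short)' --contains master)
on_f=$(git branch --format='%(refname:short)' --contains feature)
echo "D is on:" $on_d    # feature and master
echo "E is on:" $on_e    # master only
echo "F is on:" $on_f    # feature only
```

The point of the exercise is that "which branch is this commit on" has a set-valued answer: D is on two branches, E and F on one each.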

If we don't want F after all, we can now:

git checkout master

to switch back to master (and commit E) and then:

git branch -D feature

to delete the feature branch and forget all about commit F.[1] And now commits A through D are only on one branch, master.


[1] In case you change your mind, the commit tends to linger in the repository for a while. Git normally remembers the IDs of "abandoned" commits for at least 30 days via "reflogs". There is a reflog for HEAD and it retains the raw SHA-1 hash of commit F, so that you can use git reflog to look for F and get it back.

There was a reflog for branch-name feature as well, but when we did git branch -D feature, we made Git throw it away.
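Here is a rough sketch of getting such an abandoned commit back through HEAD's reflog. It assumes a scratch repository built just for the demonstration (the temporary directory, the "demo" identity, and the empty commits are all illustrative), and it relies on knowing that the commit we want is exactly one reflog entry back; in a real repository you would read the git reflog output to find the right entry:

```shell
# Scratch repository reproducing the A..E history; names and paths are illustrative.
repo=$(mktemp -d) && cd "$repo"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q
git symbolic-ref HEAD refs/heads/master   # force the initial branch name to be master
for m in A B C D E; do git commit -q --allow-empty -m "$m"; done

git checkout -q -b feature master~1       # feature starts at D
git commit -q --allow-empty -m F          # F exists only on feature
git checkout -q master
git branch -q -D feature                  # F is now unreachable from any branch name

# HEAD's reflog still records F: entry 0 is the checkout of master,
# entry 1 is the commit that created F.
git reflog -n 3
recovered=$(git rev-parse 'HEAD@{1}')
git branch feature "$recovered"           # give F a name again
git log -1 --format=%s feature            # subject of the commit feature now points to
```

Re-creating the branch name is what makes the commit "safe" again; the reflog entry alone only postpones its eventual garbage collection.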


Anonymous branches in motion

In Git, a "detached HEAD" acts as a sort of unnamed branch. Let's use this same A-through-E as before (without adding F yet), but just use git checkout master~1 instead of git checkout -b feature master~1, and draw that. Now instead of HEAD pointing to feature and having feature point to commit D, what we get is HEAD pointing directly to commit D:

           E   <-- master
          /
A--B--C--D     <-- HEAD

This is what a detached HEAD is: HEAD contains the raw commit hash (the ID, the a123456... thing) of some commit, instead of holding the name of a branch and letting the branch-name point to the tip-most commit.
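To see the difference directly, you can look at what HEAD holds before and after detaching. This is a small sketch under a few assumptions: a scratch repository created for the purpose (temporary directory, "demo" identity, and empty commits are illustrative), and the default "files" reference storage, where HEAD is a plain file inside the Git directory:

```shell
# Scratch repository with commits A..E on master (illustrative names throughout).
repo=$(mktemp -d) && cd "$repo"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q
git symbolic-ref HEAD refs/heads/master   # force the initial branch name to be master
for m in A B C D E; do git commit -q --allow-empty -m "$m"; done

attached=$(cat "$(git rev-parse --git-dir)/HEAD")   # HEAD names a branch
git checkout -q master~1                            # detach HEAD at commit D
detached=$(cat "$(git rev-parse --git-dir)/HEAD")   # HEAD now holds a raw hash

echo "attached: $attached"
echo "detached: $detached"

# git symbolic-ref -q exits non-zero, silently, when HEAD is not a symbolic ref.
if git symbolic-ref -q HEAD >/dev/null; then
    echo "on a branch"
else
    echo "detached HEAD"
fi
```

In scripts, the git symbolic-ref -q HEAD test is usually a more robust way to ask "am I detached?" than reading the file by hand.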

Nonetheless, in this situation, we can add new commits, just as we did to make commit F before. Since our original F is still actually in there, I'll draw that in too, and use G instead for this new commit:

           E    <-- master
          /
A--B--C--D--G   <-- HEAD
          \
           F    [abandoned]

What all this means is that when commit D is current, it has no children on the current branch, just as when you are on a named branch like feature. This holds even though commit E on master has D as a parent and, for that matter, the abandoned commit F is still in there as a child of D as well.

A detached HEAD is just an anonymous branch

In Git, the key difference between being on a branch and being in detached HEAD mode is that when you are on a branch, the HEAD file contains the name of the branch. That is, it "points to" the branch name, and the branch name points to the commit, as in our drawings above. When you have a detached HEAD, the HEAD file points directly to the commit ... which is the branch tip. There is no branch name; the current commit is the tip of the anonymous branch.

Being the tip of a branch, it automatically has no children.

Restoring the missing information

You might well object to all of this: I was on a branch, so my detached HEAD should be considered to be part of the branch I was on before!

That is, in fact, a reasonable objection. But what branch were you on before? All you said is: "I currently have a detached HEAD and would like to find a child commit."

Suppose you rephrase the question this way: "I have a detached HEAD pointing to some commit. I would like to find a child commit of the current commit, when working with branch (branch-name) B."

Now we have enough information! Because of the way Git works, what we have to do is start at the tip of B (the commit to which branch name B points) and work backwards from there until we arrive at the current commit (if ever).[2] There are several built-in ways to do this.

The one discussed earlier, in BVengerov's answer, is using git log --children, or equivalently, git rev-list --children (git log and git rev-list are essentially the same command with different output formats). Let's use git rev-list to avoid having to fuss with --pretty=format and --no-patch when we just want to get hash IDs.

As we just noted, the way git rev-list works is to start at some commit(s)—often, the tip of a branch, or many tips of many branches—and work backwards, following the parent pointers. By default, it just prints out each commit's hash ID:

$ git rev-list master
f8f7adce9fc50a11a764d57815602dcb818d1816
8213178c86d5583ff809c582d6727ad17b6a0bed
[snip]

With --parents it prints each commit and its parent IDs (and to avoid having to [snip] I'll add -n 2 to stop after 2 commits are printed):

$ git rev-list --parents -n 2 master
f8f7adce9fc50a11a764d57815602dcb818d1816 8213178c86d5583ff809c582d6727ad17b6a0bed 08df31eeccfe1576971ea4ba42570a424c3cfc41
8213178c86d5583ff809c582d6727ad17b6a0bed 2a96d39824464c28f2f45f2f4a4d53d7c390c9eb

(the tip of master here is a merge, with two parents, hence the three hashes printed on one very long line).

Using --children tells git rev-list to do something interesting (and technically difficult): instead of printing the parents for each commit, it walks the entire chain that it would have walked, finding the parents, and then reverses the parent/child relationships. Remember that we initially drew our graphs like this:

A <- B <- C <- D   <-- master

Commit C knows only about its parent commits, not its children. The rev-list command can walk from the tip we gave it, master, back to root commit A, and then having done so, it can "reverse all the arrows it followed" (I have emphasized this phrase for a reason):

A -> B -> C -> D

Having done all that, it can now print commit C's ID followed by commit D's ID, all on one line. If I do that now, in the Git repository for Git itself, I get this:

$ git rev-list --children -n 2 master
f8f7adce9fc50a11a764d57815602dcb818d1816
8213178c86d5583ff809c582d6727ad17b6a0bed f8f7adce9fc50a11a764d57815602dcb818d1816

There's a noticeable pause before the output starts printing, and the reason for that is that git rev-list actually walked through 43807 commits to get this result.[3] That's the number of commits on branch master at the moment. We didn't limit the traversal, so Git walked through every commit reachable from master, reversed all the arrows in the resulting walk, and finally, printed two commit hashes with their attached reversed-arrow child IDs: that of master itself (f8f7adc...), and that of the commit "just before" master on the main line (master^1 or 8213178...).
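The arrow reversal is easier to see on a tiny history than on Git's own repository. The sketch below assumes a scratch repository with four empty commits A through D (the temporary directory, the "demo" identity, and the commit names are illustrative) and compares the line that --parents prints for D with the line that --children prints for C:

```shell
# Scratch repository: A <- B <- C <- D on master (illustrative names).
repo=$(mktemp -d) && cd "$repo"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q
git symbolic-ref HEAD refs/heads/master   # force the initial branch name to be master
for m in A B C D; do git commit -q --allow-empty -m "$m"; done

c=$(git rev-parse master~1)
d=$(git rev-parse master)

# --parents: each line is a commit followed by its parents.
parents_of_d=$(git rev-list --parents master | grep "^$d")
# --children: each line is a commit followed by the children found during this walk.
children_of_c=$(git rev-list --children master | grep "^$c")

echo "parents line for D:  $parents_of_d"    # D's hash, then C's hash
echo "children line for C: $children_of_c"   # C's hash, then D's hash
```

The two lines hold the same pair of hashes in opposite order, which is the reversed arrow.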


[2] If the current commit is not, in fact, an ancestor of the tip of branch B (for instance, in our graphs above, if we ask about master when we're on commit F or commit G, neither of which is contained in the master branch) then this git rev-list will never reach the current commit.

[3] To find this count, I ran:

git rev-list --count master

which simply gives a count of the commits visited in the rev-list walk. Another way is to run:

git rev-list master | wc -l

which lists every commit on standard output, with that output piped to the wc program and wc instructed to count lines. But getting git rev-list to do the counting is roughly twice as fast.


The easy way doesn't quite work

I'm in the Git repository for Git itself, and I did:

$ git checkout master
$ git checkout HEAD^

and now git status says HEAD detached at 8213178. We want to find the (single) child of 8213178, which is the commit master points to. So we try this:

$ git rev-list --children -n 1 HEAD
8213178c86d5583ff809c582d6727ad17b6a0bed

Well, that was a bust! But what went wrong?

Remember the phrase I emphasized earlier: git rev-list will reverse all the arrows it followed. Alas, it followed no arrows at all to get to HEAD! It started at HEAD (8213178...). That was the only revision it needed (-n 1) so it stopped there too, and printed HEAD's ID and finished.

Using -n 2 makes it at least follow one arrow—one parent link—but it's not really helpful:

$ git rev-list --children -n 2 HEAD
8213178c86d5583ff809c582d6727ad17b6a0bed
2a96d39824464c28f2f45f2f4a4d53d7c390c9eb 8213178c86d5583ff809c582d6727ad17b6a0bed

This time, it started at HEAD again and followed an arrow to HEAD^ (2a96d39...), so it was able to reverse that arrow: it told us that 8213178... is a child of 2a96d39.... But we want to know: what nodes have 8213178... as a parent? And to get Git to discover that, we have to start somewhere beyond 8213178....
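The same effect shows up in a five-commit scratch repository. In this sketch (temporary directory, "demo" identity, and empty commits A through E are illustrative assumptions), the only thing that changes between the two commands is the starting point of the walk:

```shell
# Scratch repository: A..E on master, then detach HEAD at D (illustrative names).
repo=$(mktemp -d) && cd "$repo"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q
git symbolic-ref HEAD refs/heads/master   # force the initial branch name to be master
for m in A B C D E; do git commit -q --allow-empty -m "$m"; done
git checkout -q master~1                  # detached HEAD at D

here=$(git rev-parse HEAD)

# Walk starts at HEAD: no arrow leading into HEAD is ever followed.
from_head=$(git rev-list --children HEAD | grep "^$here")
# Walk starts at master: the arrow E -> D is followed, so it can be reversed.
from_master=$(git rev-list --children master | grep "^$here")

echo "starting at HEAD:   $from_head"     # just HEAD's own hash
echo "starting at master: $from_master"   # HEAD's hash followed by E's hash
```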

There's just one right place(s) to start

The place to start is the tip of master. We know this because we're interested in "children of HEAD that are on the road that leads to the tip of master".

If we wanted, we could ask instead for "children of HEAD that are on the road that leads to the tip of next", or "children of HEAD that are on the road that leads to the tip of pu", or of any other branch. Or we could even ask for children on any branch:

git rev-list --children [other options] --branches

The --branches flag means "all branches"; --branches=abc* means "all branches whose name starts with abc"; and so on.

The point here is that we must tell Git where to start. We cannot start at HEAD. We can stop there, but we cannot start there. Stopping at HEAD can speed things up—there's no need to look at over 43 thousand commits—so we might try HEAD..master:

$ git rev-list --children HEAD..master
f8f7adce9fc50a11a764d57815602dcb818d1816
08df31eeccfe1576971ea4ba42570a424c3cfc41 f8f7adce9fc50a11a764d57815602dcb818d1816
1ecc6b291c162b9fc7b59a3251c4cbbcf3b07b84 08df31eeccfe1576971ea4ba42570a424c3cfc41
6cbec0da471590a2b3de1b98795ba20f274d53fa 1ecc6b291c162b9fc7b59a3251c4cbbcf3b07b84
8e4571e57a1a3cc6f1318b3da8612b2e3c8e1252 6cbec0da471590a2b3de1b98795ba20f274d53fa
c81d2836753a268be07346d362ffab3c6a5e14a9 8e4571e57a1a3cc6f1318b3da8612b2e3c8e1252
[12 more lines snipped]

Whoa, what happened here? The answer is a bit tricky but has to do with the fact that master is a merge commit. Let's look at this, somewhat simpler, expression first:

$ git rev-list --count HEAD..master
18

Even though HEAD is just master^1, i.e., the first parent of master, there are in fact 18 reachable commits in HEAD..master. This is because there are 17 commits on master^2 that are not also on master^1. Add the merge itself and you get these 18 commits. Drawing it accurately is tough in general because Git's commit DAG is very messy, but a simplified picture looks something like this "HEAD in a bathtub":

                       HEAD
                        |
                        v
...--x--x---------------x--o   <-- master
         \                /
          o--o--o--o--o--o

The expression HEAD..master means that Git should start crossing out (x-ing) commits from HEAD on back, while taking commits from master on back. So this takes master itself, and tries to take master^1 (the HEAD commit) but that gets x-ed out, and tries (and succeeds) to take master^2 and then all the commits along the bottom row, stopping once it rejoins the top row where the commits get x-ed out.

What works

The trick here is that we must get git rev-list to examine the HEAD commit itself, so that it follows any arrows that lead to HEAD. Then we can use grep or similar to pick out the line that has the HEAD commit, and use --children so that that line lists the commits that have HEAD as their parent:

$ git rev-list --children master | grep "^$(git rev-parse HEAD)"

This walks all 43-thousand-plus commits, finding everything we care about and lots of things we don't, and then extracts the one line that starts with the commit we do care about, which is the one starting with the current commit ID (grep "^$(git rev-parse HEAD)"—the hat character here is grep's notation for "beginning of line", and thus has nothing to do with Git itself).

We can speed this up a little bit by terminating the walk at any parent of HEAD:

git rev-list --children master ^HEAD^@ | grep "^$(git rev-parse HEAD)"

It's tempting to try to combine this with -n and/or --reverse, but this is doomed to fail for two reasons:

  • The HEAD commit can be anywhere in this traversal depending on the DAGlet structure that follows HEAD
  • The -n limiting is done before reversing the list, so that -n 1 always just gets you the master commit anyway.

So we could stop here and declare victory, using git rev-list --children to reverse the arrows, using master to get the selection to start at the right place, optionally using ^HEAD^ or ^HEAD^@ to stop the traversal and speed up the git rev-list walk, and (crucially) using grep to pick out the desired line, and then view all the child commit IDs.
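Put together, the recipe fits in a small shell function. This is only a sketch: children_of_head is a hypothetical helper name, the scratch repository (temporary directory, "demo" identity, empty commits A through F) is illustrative, and the function makes no attempt to handle a HEAD that is not reachable from the tips you pass in:

```shell
# Scratch repository: A..E on master, F on feature (forked at D), HEAD detached at D.
repo=$(mktemp -d) && cd "$repo"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q
git symbolic-ref HEAD refs/heads/master   # force the initial branch name to be master
for m in A B C D E; do git commit -q --allow-empty -m "$m"; done
git checkout -q -b feature master~1
git commit -q --allow-empty -m F
git checkout -q master~1                  # detached HEAD at D

# Hypothetical helper: print, one per line, the children of HEAD that are
# reachable from the starting points given as arguments (e.g. master, --branches).
children_of_head() {
    here=$(git rev-parse HEAD)
    # ^HEAD^@ excludes HEAD's parents and their ancestors, keeping the walk short.
    git rev-list --children "$@" ^HEAD^@ |
        grep "^$here" |
        cut -s -d' ' -f2- |               # drop HEAD's own hash; print nothing if childless
        tr ' ' '\n'
}

children_of_head master       # E only
children_of_head --branches   # E and F, in whatever order Git lists them
```

Note that the answer depends on the starting points: the same HEAD has one child "towards master" and two children "towards any branch".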

But this is Git, so there is another way

The rev-list command also supports a flag, --ancestry-path, that does just what we need here.

Normally, as I noted above, git rev-list X..Y "means" git rev-list Y ^X: find all commits reachable from commit Y, excluding all commits reachable from commit X. If there are no merges in the DAGlet selected by X..Y, this list—if printed in the correct order, at least—ends or begins with the first commit after commit X. The problem occurs when there are merges: the ^X part tosses out commit X and its ancestors, but fails to toss out the ancestors of Y that are not descendants of X. Look at the "HEAD in a bathtub" graph again, though this time I will add a few more commits on the other side of the bathtub, and in fact, make another branch-and-merge in it:

                       HEAD
                        |
                        v
...--x--x---------------x--o--o---o   <-- master
         \                /    \ /
          o--o--o--o--o--o      o

What we want is to "x out" commits that are not to the right of (descendants of) the "you are here" point. That is, we want this instead:

                       HEAD
                        |
                        v
...--x--x---------------x--o--o---o   <-- master
         \                /    \ /
          x--x--x--x--x--x      o

All the remaining os are both ancestors of the tip of master and descendants of HEAD. This is exactly what --ancestry-path does: it means "for any commits we explicitly exclude, also exclude commits that do not have those commits as an ancestor". (That is, Git inverts the condition: it cannot test "is descendant of" so easily, but it can test "is ancestor of". If D is a descendant of A, then by definition, A is an ancestor of D, so by testing for "not ancestor-of" it can deduce "not descendant-of".)
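A small "bathtub" makes the difference measurable. The sketch below assumes a scratch repository (temporary directory, "demo" identity, and empty commits with one-letter subjects are illustrative) in which master is A, B, C, then a merge M of a side branch S1, S2 that forked at A, then T, with HEAD detached at C. It also uses one extra observation that is mine rather than something stated above: with --topo-order no parent is listed before its children, so the last line of the --ancestry-path listing has to be a direct child of HEAD:

```shell
# Scratch repository (illustrative names):
#   A--B--C--M--T   <-- master      HEAD detached at C
#    \      /
#     S1--S2        <-- side
repo=$(mktemp -d) && cd "$repo"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q
git symbolic-ref HEAD refs/heads/master   # force the initial branch name to be master
git commit -q --allow-empty -m A
git branch side
for m in B C; do git commit -q --allow-empty -m "$m"; done
git checkout -q side
for m in S1 S2; do git commit -q --allow-empty -m "$m"; done
git checkout -q master
git merge -q --no-ff -m M side
git commit -q --allow-empty -m T
git checkout -q master~2                  # detached HEAD at C

plain=$(git rev-list --count HEAD..master)                    # T, M, S2, S1
on_path=$(git rev-list --count --ancestry-path HEAD..master)  # T, M

# Children come before parents in topological order, so the last line is a child of HEAD.
next=$(git rev-list --topo-order --ancestry-path HEAD..master | tail -n 1)
echo "HEAD..master: $plain commits, with --ancestry-path: $on_path"
git log -1 --format=%s "$next"            # subject of the chosen "next" commit
```

When HEAD has several children on the way to master, this picks just one of them, which is the ambiguity discussed next.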

If we list the resulting commits in some sort of topologically-sorted order, then pick one "closest to" HEAD, we get a suitable "next commit". Note that sometimes there are two or more such commits. For instance, let's move forward one commit, dragging our HEAD right onto the edge of the bathtub:

                          HEAD
                           |
                           v
...--x--x---------------x--x--o---o   <-- master
         \                /    \ /
          x--x--x--x--x--x      o

Now let's step forward again:

                             HEAD
                              |
                              v
...--x--x---------------x--x--x---o   <-- master
         \                /    \ /
          x--x--x--x--x--x      o

Which of the two remaining commits should we visit? Suppose we pick the "topmost" one. Will we ever visit the lower one? We could try to pick the lower one, which will make us visit them all. (I am not going to suggest a method for this.)

Now consider this DAGlet, which I think resembles a benzene ring or phenyl group:

          o--o
         /    \
...--x--x      o   <-- branch
         \    /
          o--o

If we move to the top row, how will we ever revisit the bottom row? If we move to the bottom row, how will we ever revisit the top row?

The really-right way

The only real solution to the complete problem [4] is to mark up the commits to visit before leaving the named-branch-tip. That is, if your goal is to visit every commit (or some reasonably well defined subset of "every commit") in some range of commits from "where I am now" to "some point in the past", you should start by marking out the entire range. An easy way to do that is to run git rev-list to get a list of every commit ID in that range (using --boundary, the ^X^@ syntax, or an explicit addition of the starting-point X if you want to include commit X in a range like X..Y). You then have, probably saved in a file, the IDs of every commit to visit, so you won't miss some when traversing a "benzene ring" DAGlet.
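As a rough illustration of "mark the range first, then walk it", the sketch below saves the list to a file and checks out each commit in turn. Everything concrete in it is an assumption made for the demonstration: the scratch repository, the "demo" identity, the tag named start on the fork point, the branch named lower, and the choice of --reverse --topo-order so that parents are visited before their children:

```shell
# Scratch "benzene ring" (illustrative names):
#     T1--T2
#    /      \
#   A        M   <-- master
#    \      /
#     L1--L2     <-- lower
repo=$(mktemp -d) && cd "$repo"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q
git symbolic-ref HEAD refs/heads/master   # force the initial branch name to be master
git commit -q --allow-empty -m A
git tag start                             # remember the fork point
git branch lower
for m in T1 T2; do git commit -q --allow-empty -m "$m"; done
git checkout -q lower
for m in L1 L2; do git commit -q --allow-empty -m "$m"; done
git checkout -q master
git merge -q --no-ff -m M lower

# Mark out the whole range before moving anywhere.
plan=$(mktemp)
git rev-list --reverse --topo-order --ancestry-path start..master > "$plan"

# Visit every marked commit, oldest first; both rows of the ring are covered.
visited=""
while read -r commit; do
    git checkout -q "$commit"
    visited="$visited $(git log -1 --format=%s)"
done < "$plan"
git checkout -q master
echo "visited:$visited"
```

Because the list is fixed up front, it does not matter which row of the ring comes first: nothing is skipped, and the merge is always visited last.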

Alternatively, you can mark two commits and then work between them. This is how git bisect works, for instance.


[4] Well, obviously this depends on just how you constrain the problem. Which is why defining what you want to do is so important!

torek answered Oct 03 '22 08:10