 

Find first common child of two commits

           :
           A
T         / \
i        B   C
m        :   :
e        D   E
          \ /
|          F
V          :

git merge-base B E finds the common ancestor A of the two commits. Is there a way to find the commit F where the two branches are merged again?

bara asked Jun 06 '12 21:06




1 Answer

There isn't necessarily a unique answer to this problem, so you have to decide on a few constraints and/or heuristics, or accept the possibility of more than one "downstream" merge. The heart of the problem is the same as the problem of multiple merge base candidates: use git merge-base --all to list them all, otherwise Git just picks whichever one pops up first in its algorithm. We can do the same, or find all best merge candidates.

You've drawn what I usually prefer to render sideways as, e.g.:

  B--...--D
 /         \
A           F--G--H   <-- branch1
 \         /
  C--...--E   <-- branch2

but we might have this:

  B--C---D--E--...   <-- branch1
 /    \ /
A      X
 \    / \
  F--G---H--I--...   <-- branch2

In this case both merges D and H are equally good candidates for "the place where the branches re-merge" if you allow both branch1 and branch2 to be considered. Even if you don't, if branch2 merges back into branch1 later:

  B--C---D--E---J--...   <-- branch1
 /    \ /      /
A      X      /
 \    / \    /
  F--G---H--I--...   <-- branch2

then just starting from (or ending at) branch1, both D and H are equally good candidates.

In any case, what we need here is to enumerate commits that end in one or all of the branches you want to consider. To do that, we can use, e.g.:

git rev-list --ancestry-path ^B ^E branch1 branch2

This finds commits that are ancestors of branch1 or branch2, and are also descendants of commit B or of commit E.

To really get the right answer, we want to add --children. That way we'll get the hash ID of each commit, along with the children of that commit that go in this same direction. Git achieves the --children by reversing the backwards connections from the children to the parents as it traverses the links, which is good enough; but we won't see commits B or E. This is kind of a problem. To get them shown, we can add --boundary. This is not ideal: --boundary sometimes includes some commits we don't want. Fortunately, they're all marked with - so we can exclude extra boundary commits by knocking out ones that aren't the commits we care about.

I'm not going to show any of that, but if you did that, you would now have a list, one entry per line, of each node (vertex) and its edges that connect to its children. You can now ask: what is the LCA of the DAG formed by these (V,E) sets?
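For readers who want a starting point, here is a minimal sketch of the enumeration described above. The function name list_children and the variables h1 and h2 are hypothetical, not part of Git; the sketch assumes the two commits and the branch tips are passed as arguments, and that boundary lines arrive prefixed with "-" as described above. It keeps ordinary commits as-is, keeps a boundary commit only if it is one of the two commits of interest, and drops the other boundary commits.

```shell
# list_children COMMIT1 COMMIT2 TIP...   (hypothetical helper)
# Prints one line per commit: the commit hash followed by the hashes of
# its children inside the selected range.
list_children() {
    # Resolve both commits to full hashes so they can be compared
    # against the rev-list output.
    h1=$(git rev-parse --verify "$1^{commit}") || return
    h2=$(git rev-parse --verify "$2^{commit}") || return
    shift 2
    git rev-list --ancestry-path --children --boundary "^$h1" "^$h2" "$@" |
    awk -v h1="$h1" -v h2="$h2" '
        # Boundary lines start with "-": keep only the two commits we
        # asked about, and strip the marker from them.
        /^-/ { if ($1 == ("-" h1) || $1 == ("-" h2)) print substr($0, 2); next }
        # Every other line is an ordinary commit in the range.
        { print }
    '
}
```

Called as list_children B E branch1 branch2, this would give the vertex-and-edges list to feed into whatever LCA implementation you write.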

It would be nice if we could just use Git's LCA algorithm, but Git does not have a way to invoke it on arbitrary graphs: we can only invoke it on commits, and the actual commits have parents, not children. So you will have to write your own. See "Algorithm to find lowest common ancestor in directed acyclic graph?" (which, unfortunately, has no accepted answer). This algorithm looks correct at first blush; it has one of the two standard definitions for LCA in a graph.

If we're willing to settle for a not-nearly-as-good answer, though, we can get something that's probably sufficient in most cases by adding --topo-order (to make sure parents come out after all their children) and --merges (to omit everything that's not a merge commit). This will get a list of all merges.

I have made here a test repository with a simple case:

$ git log --all --decorate --oneline --graph
* 91fcef6 (HEAD -> master) J
* d1e5905 I
*   5bf18a0 merge
|\  
| * 49b2ba7 (sidebr) D
| * 725e5ea C
| * 36b830d (tag: B) B
* | 198a982 (tag: G) G
* | 216bc01 F
* | e905e59 E
|/  
* 5df9428 initial

So I can now name commits B and G using the tags B and G, and the branch I want for a "move in this direction" is just master. So:

$ git rev-list --topo-order --merges --ancestry-path ^B ^G master
5bf18a0797dfd78107928a9a4095f357cfabe914

The last line here is the merge that's "closest" to the two commits. In this case, that's also the only line, and that's the merge we want.
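The same heuristic can be wrapped in a small helper. This is only a sketch: the name closest_merge is hypothetical, and it simply takes the last line of the rev-list output shown above, so it inherits every limitation of that heuristic.

```shell
# closest_merge COMMIT1 COMMIT2 TIP   (hypothetical helper)
# Prints the last merge listed by rev-list, i.e. the merge that comes out
# last in topological order. This is a heuristic, not a verified answer.
closest_merge() {
    git rev-list --topo-order --merges --ancestry-path "^$1" "^$2" "$3" |
        tail -n 1
}
```

In the test repository above, closest_merge B G master would print the hash of the commit labelled "merge".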

The flaw here is clear enough once we draw it. Suppose I had a more complex graph, such as:

      I--J
     /    \
    H      M--N
   / \    /    \
  /   K--L      \
 /               \
A                 P--Q  <-- master
 \               /
  \   C--D      /
   \ /    \    /
    B      G--O
     \    /
      E--F

If I now run git rev-list --topo-order --merges --ancestry-path ^B ^H master, I'll enumerate commit P, then both G and M in some order. So the last line will either be commit G or commit M, and while both of these are merges, they don't meet the right criterion: they don't merge B and H. Only commit P does that.

Hence, to check whether you have a right answer—without handling the multiple LCA issue—you should take each of the output lines from this git rev-list command, probably in reverse order (consider adding --reverse), and see if both commits are ancestors of each. "Internal" merges like G and M will have only one commit as an ancestor. To do the is-ancestor test, use git merge-base --is-ancestor:

if git merge-base --is-ancestor "$commit1" "$mergecommit" &&
       git merge-base --is-ancestor "$commit2" "$mergecommit"; then
    # both commits are ancestors: we've found a correct candidate
    echo "$mergecommit"
else
    # only one (or neither) is an ancestor: move on to another candidate
    :
fi
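Putting the pieces together, the loop over candidates might look like the following sketch. The function name find_remerge and the variable candidate are hypothetical; the sketch assumes a single branch tip and, like the text above, does not handle the multiple-LCA issue: it stops at the first candidate that passes the test.

```shell
# find_remerge COMMIT1 COMMIT2 TIP   (hypothetical helper)
# Prints the first merge, in reversed topological order, that has both
# commits as ancestors. Returns non-zero if no candidate qualifies.
find_remerge() {
    # Commit hashes contain no whitespace, so word splitting is safe here.
    for candidate in $(git rev-list --reverse --topo-order --merges \
                           --ancestry-path "^$1" "^$2" "$3"); do
        if git merge-base --is-ancestor "$1" "$candidate" &&
           git merge-base --is-ancestor "$2" "$candidate"; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}
```

With the more complex graph drawn above, find_remerge B H master should skip the "internal" merges G and M and print commit P.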
torek answered Oct 05 '22 10:10