Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does git log's ancestry path work with multiple negated commits?

Tags:

git

The behavior of git's --ancestry-path option for git log or git rev-list is relatively straightforward when there is only one negated commit - that is, if you have:

git rev-list ^A B
# or it's shorthand form:
git rev-list A..B

...then the result is all commits that are descendents of A and ancestors of B (for a fuller explanation see here).

However, I wish to know how --ancestry-path behaves if you have TWO (or more) negated revisions - ie:

git rev-list ^C ^A B

(Note - I will refer to non-negated revisions, such as B above, as "postive" commits, and all negated revisions as "negative" commits.)

After inspecting the results in some simple test cases, the "descendents" restriction imposed by --ancestry-path for negative commits seemed to be OR'd / union'ed together - that is, git seemed to be doing:

  • first, find all commits that are ancestors of at least one positive commit
  • then, filter down to only commits that are descendents of at least one negative commit
  • then, filter out commits that are ancestors of at least one negative commit

Or, in more set-like notation:

IntersectionOf(
    Union(ancestors(positive1), ...ancestors(positiveN))
       AND
    Union(descendents(negative1), ...descendents(negativeN))
)
- Union(ancestors(negative1), ...ancestors(negativeN))

However, in some more complex scenarios, this proved to incorrect - there were more commits excluded in the final result than the above logic would imply.

For example, suppose you have this full graph:

ancestors(MA3)

      -----MA3
     /     |  
    /      MB3
   A3     /|  
  /|    B3 |  
 / |   /|  |  
R3-+--- |  |  
|  |    |  |  
|  |  --+--MA2
|  | /  |  |  
|  |/   |  MB2
|  A2   | /|  
| /|    B2 |  
|/ |   /|  |  
R2-+--- |  |  
|  |    |  |  
|  |  --+--M1 
|  | /  |  /  
|  A1   | /   
| /     B1    
|/     /      
R1-----       

This represents a common root branch (R), which both feature branches A and B inherit from, and A and B are in turn merged into a common merge branch (M). (And here's a bash script to set up a repo with this structure!)

We wish to find:

--ancestry-path ^R2 ^B3 MA3

If we do

Union(descendents(R2),      Union(ancestors(R2),
      descendents(B3))            ancestors(B3))

      -----MA3                                 
     /     |                                   
    /      MB3                                 
   A3     /|                                   
  /|    B3 |                             B3    
 / |   /|  |                            /|     
R3-+--- |  |                     R3----- |     
   |    |  |                     |       |     
   |  --+--MA2                   |       |     
   | /  |  |                     |       |     
   |/   |  MB2      MINUS        |       |     
   A2   | /                      |       |     
        B2                       |       B2    
                                 |      /|     
                                 R2----- |     
                                 |       |     
                                 |       |     
                                 |       |     
                                 |       |     
                                 |       B1    
                                 |      /      
                                 R1-----        

then we expect:

      -----MA3
     /     |  
    /      MB3
   A3      |  
   |       |  
   |       |  
   |       |  
   |       |  
   |  -----MA2
   | /     |  
   |/      MB2
   A2         

However, what we actually get from

git rev-list --ancestry-path ^R2 ^B3 MA3 | xargs -i git tag --points-at '{}'

is:

MA3
MB3
A3
MA2
A2

ie:

      -----MA3
     /     |  
    /      MB3
   A3      |  
   |       |  
   |       |  
   |       |  
   |       |  
   |  -----MA2
   | /        
   |/         
   A2         

...which is the same, except MB2 is excluded.

So, clearly, my guess at how --ancestry-path works with multiple ^ negations is wrong. Can anyone explain the logic git actually uses, and why MB2 is excluded?

EDIT So, just to visually clarify the answer from jthill below (https://stackoverflow.com/a/65450624/920545), he's saying it works like this:

  • First, git calculates ancestors(MA3) - Union(ancestors(R2), ancestors(B3) - so we end up with:
ancestors(MA3) - Union(ancestors(R2),
                       ancestors(B3))

      -----MA3
     /     |  
    /      MB3
   A3      |  
   |       |  
   |       |  
   |       |  
   |       |  
   |  -----MA2
   | /     |  
   |/      MB2
   A2      |  
   |       |  
   |       |  
   +       |  
   |       |  
   |  -----M1 
   | /        
   A1         

It then tries to find all nodes along paths from R2 or B3 to MA3, USING THIS LIMITED SET OF NODES. So, it tries to find all paths from (x) to [y] on this graph:

        ----[MA3]
       /     |  
      /      MB3
     A3     /|  
     |  (B3) |  
     |       |  
     |       |  
     |       |  
     |  -----MA2
     | /     |  
     |/      MB2
     A2      |  
    /|       |  
   / |       |  
(R2) |       |  
     |       |  
     |  -----M1 
     | /        
     A1         

...which is why we end up with the result I showed above. Thank you jthill!

like image 811
Paul Molodowitch Avatar asked Dec 11 '20 22:12

Paul Molodowitch


1 Answers

MB2 is on an ancestry path to R2, as its name suggests, but that ancestry path steps outside the bounds you set when you excluded ancestors of B3 from your walk.

Git's doing what you asked for: take MA3 and all its ancestors, exclude B3 and R2 and all their ancestors, and then try to trace an ancestry path to the excluded tips.

The effect that gets often produces the same results as all-descendants-minus-all-ancestors tracing, but you've found a case where the difference matters. Git traces ancestry, not descent.

like image 187
jthill Avatar answered Nov 15 '22 20:11

jthill