Show commits made on branch originally (filter out merged commits)

Tags: git, merge, filter

In a git repo with several branches, I've merged a feature branch into master and merged back from master into the feature branch. How can I now use git log to see only those commits made on the feature branch? I need to filter out those commits made to other branches, even though they've been merged back into this feature branch.

$ git status
On branch some_feature

$ git checkout master
$ git merge some_feature
$ git checkout some_feature
$ git merge master

$ # Lots of coding and days gone by

$ git log --some_magic_here

If it matters, the machine on which I'd like to run git log is not the same machine that was used to code the feature. I need this to perform a code review of another developer's work. Filtering by --author=SomeDev doesn't help, as I am reviewing one specific feature branch.

asked Jan 12 '15 11:01 by dotancohen


1 Answer

TL;DR: you want --first-parent (see git rev-list documentation) and a few other selection criteria.

Git does not keep track of "on which branch a commit was actually made". For instance, suppose you are on branch B and you do this:

$ git checkout --detach B
[message about "detached head"]
[edit some file here, and git add result]
$ git commit -m message
$ git branch -f B HEAD
$ git checkout B

This will add a new commit to the tip of branch B, even though the commit was made while not on a branch at all. Or, suppose there are two branches B1 and B2 that both point to commit C7:

... <- C5 <- C6 <- C7   <-- B1, B2

If you make a new commit C8 whose parent is C7, then make branch B2 point to C8, that commit is now on branch B2. If you made the new commit while on branch B1 then B1 also points to C8, but if you then use git reset to move B1 to point back to C7, it is no longer possible to tell[1] that the commit was originally made on B1 rather than B2.
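A minimal sketch of this second scenario, assuming a throwaway repository created with mktemp. The branch names B1 and B2 follow the text; the placeholder identity and the use of empty commits are only for illustration.

```shell
#!/bin/sh
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/B1            # start on a branch named B1
git config user.name "Example"                 # placeholder identity
git config user.email "example@example.invalid"

git commit -q --allow-empty -m C7
git branch B2                                  # B1 and B2 both point to C7

git commit -q --allow-empty -m C8              # made while "on" B1
c8=$(git rev-parse HEAD)
git branch -f B2 "$c8"                         # B2 now points to C8 as well
git reset -q --hard HEAD~1                     # move B1 back to C7

# Only B2 contains C8 now.
containing=$(git branch --format='%(refname:short)' --contains "$c8")
echo "branches containing C8: $containing"

# The commit object holds tree, parent, author, committer and message,
# but no branch name.
git cat-file -p "$c8"
```

After the reset, the commit object itself is unchanged, which is why the history alone cannot say that C8 started out on B1.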

All that said, if you maintain good consistency when you merge branches, you can perform graph traversals on the commit graph, finding commits "between" particular merges and then working through one particular parent chain, to detect which branch the commits were probably made on. For instance, with feature and master as described, suppose that commits are consistently made on feature, and that feature is periodically merged into master with --no-ff so that each merge on master is a true merge commit and not merely a fast-forward. Then the graph pattern will look something like this:

... --X--*------*-o---o-----*--*   <-- master
       \       /   \       /  /
        A--B--C--D--E--F--G--H     <-- feature

(where "time" increases as one moves right, and the * nodes are the merge commits made on branch master, with o nodes representing other commits on master).

In this case, you can find commits that were (probably) made on feature by starting with the tip of feature (commit H) and traversing only first-parent commits, which yields A through H plus X and its first-parent ancestors, and then discarding commits findable by following only first-parent commits on master. This second criterion eliminates the commit marked X above (and any preceding commits). (Note that in the case of merge commit E, its first parent is D, and its second parent is the o commit above it to the left on master. For all the merges on master, the first parent is on the same line, and the second parent is below and to the left on feature.)
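The parent ordering described here can be checked directly. The following is a small sketch in a scratch repository; the commit names D, o and E mirror the diagram, and everything else (identity, empty commits) is an assumption for illustration.

```shell
#!/bin/sh
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"                 # placeholder identity
git config user.email "example@example.invalid"

git commit -q --allow-empty -m X
git checkout -q -b feature
git commit -q --allow-empty -m D               # commit made on feature
git checkout -q master
git commit -q --allow-empty -m o               # commit made on master

git checkout -q feature
git merge -q -m E master                       # merge made while on feature

# HEAD^1 is the branch that was checked out; HEAD^2 is what was merged in.
first=$(git log -1 --format=%s HEAD^1)
second=$(git log -1 --format=%s HEAD^2)
echo "first parent of E:  $first"
echo "second parent of E: $second"
```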

Whether this is sufficient depends on how disciplined you (and others) have been in the commit process. If master has its own internal branch-and-merges, you may have a graph more like this:

... --X---Y---*-----*-o---o---------*--*   <-- master
       \   \ /     /   \           /  /
        \   Z     /     \         /  /
         \       /       \       /  /
          A--B--C--D------E--F--G--H       <-- feature

Now you can't just use the first/not-first parent test quite as easily, since commit Y or Z is necessarily the second parent of a merge on master, but note that Z is not reachable from feature without first traversing E, which is a merge commit "directly on" feature whose second parent commit is on master.

In other words, I use the slightly peculiar phrase "directly on" a branch to identify commits that are find-able by starting at the branch tip, and working backwards through first parent only. For feature these are (still) A through H, plus unfortunately X and its (first) parents.

The git rev-list command is one of git's very central commands (it's kind of the big-sister version of git rev-parse, which is also very central), and it has a flag that specifically implements this "directly on" notion: --first-parent. The rev-list command walks[2] through some part(s) of the commit graph: you tell it some commit to start from, and it finds that commit's parents, and those parents' parents, and their parents, and so on. If you specify --first-parent it finds only the first parent of each commit. (This obviously affects only merge commits, since by definition a non-merge commit has at most one parent commit.)
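To see the difference between a full walk and a first-parent walk, here is a sketch on a five-commit scratch repository. The helper function commit and the commit names are hypothetical, and git log is used for the listing only so that subjects print instead of hashes.

```shell
#!/bin/sh
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"                 # placeholder identity
git config user.email "example@example.invalid"

commit() { git commit -q --allow-empty -m "$1"; }   # hypothetical helper

commit X
git checkout -q -b feature; commit A
git checkout -q master;     commit o
git checkout -q feature
git merge -q -m E master                       # E: first parent A, second parent o
commit F

all=$(git rev-list --count feature)                       # follows every parent
fp=$(git rev-list --count --first-parent feature)         # first parents only
direct=$(git log --first-parent --format=%s feature | xargs)

echo "reachable from feature:       $all commits"
echo "reachable via first parents:  $fp commits ($direct)"
```

The first-parent walk skips o, which is reachable only through the second parent of E, but it still reaches X.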

In your particular case, you have also said that you wish to eliminate merges into your feature branch, i.e., merge commits that are directly on feature. To do this, simply tell rev-list to exclude merge commits from its output, with --no-merges.

Eliminating commit X and earlier is a bit harder, but as it turns out, not that hard, provided you know which branch(es) the commits like X that you want to exclude are on. The --not argument (which has several alternative spellings) tells rev-list which commits to exclude. So:

git rev-list --first-parent feature --not master --no-merges

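is meant to identify the list of commits "directly on" feature (A through H) but not directly on master (X and earlier), and then list only those that are not merges (thus eliminating commit E). One caveat: --first-parent restricts the walk from feature, but the exclusion started by --not master still follows every parent of each merge on master, so commits that have already been merged from feature into master are excluded as well. Newer versions of Git offer --exclude-first-parent-only, which makes the exclusion walk follow only first parents and so matches the "not directly on master" reading used here.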
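The sketch below tries the command on a small graph where feature has been merged into master and master merged back. The helpers tick, commit and merge are hypothetical, and git log is used only so that subjects print instead of hashes. As far as I can tell, exclusion with --not follows all parents of master's merges, so the command as written reports only the commits made after the last merge into master. The --exclude-first-parent-only option exists only in newer Git releases, so the script guards its use.

```shell
#!/bin/sh
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"                 # placeholder identity
git config user.email "example@example.invalid"

# Hypothetical helpers: give every commit a distinct, increasing timestamp.
t=1420113600
tick()   { t=$((t + 60)); export GIT_AUTHOR_DATE="$t +0000" GIT_COMMITTER_DATE="$t +0000"; }
commit() { tick; git commit -q --allow-empty -m "$1"; }
merge()  { tick; git merge -q --no-ff -m "$1" "$2"; }

commit X
git checkout -q -b feature; commit A; commit B
git checkout -q master;     merge M feature; commit o   # feature merged into master
git checkout -q feature;    commit D; merge E master; commit F

# The command from the text: A and B are reachable from master through M,
# so they are excluded along with X.
plain=$(git log --first-parent --no-merges --format=%s feature --not master | xargs)
echo "as written: $plain"

# Restrict the exclusion walk to first parents too, where supported.
if out=$(git log --first-parent --exclude-first-parent-only --no-merges \
             --format=%s feature --not master 2>/dev/null); then
    echo "with --exclude-first-parent-only: $(echo "$out" | xargs)"
else
    echo "this Git does not support --exclude-first-parent-only"
fi
```

If the first line shows fewer commits than you expect for a branch that has already been merged into master, the second form is the one to try.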

You will probably want to pare this down further, e.g., perhaps by using commit time stamps (--since and --until). You may (or may not) want to change the order in which revisions are listed as well, perhaps using --topo-order to put them in graph order rather than date order, for instance.
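As an illustration of paring the list down by date, this sketch creates three commits on feature with explicit timestamps and selects the middle one. The helper commit_at, the dates and the commit names are all made up for the example. Note that --since and --until compare against the committer date.

```shell
#!/bin/sh
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"                 # placeholder identity
git config user.email "example@example.invalid"

# Hypothetical helper: empty commit with a fixed author and committer date.
commit_at() {
    GIT_AUTHOR_DATE="$1" GIT_COMMITTER_DATE="$1" \
        git commit -q --allow-empty -m "$2"
}

commit_at "2014-12-01 12:00:00 +0000" X
git checkout -q -b feature
commit_at "2015-01-01 12:00:00 +0000" early
commit_at "2015-01-05 12:00:00 +0000" mid
commit_at "2015-01-10 12:00:00 +0000" late

# Same selection as before, limited to a date window and listed in graph order.
picked=$(git log --first-parent --no-merges --topo-order \
             --since="2015-01-04" --until="2015-01-08" \
             --format=%s feature --not master | xargs)
echo "commits in the window: $picked"
```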

Putting this together with git log

Fortunately for you, git log effectively invokes git rev-list with your revision specifiers. So instead of having to use git rev-list first and then somehow feed the results to git log, you can just give these same specifiers to git log. In fact, when figuring out which limiters you want (with things like --since and --until), git log is a lot easier to work with, as a big pile of raw SHA-1s is not very useful to humans.
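Concretely, the same specifiers can be handed to git log unchanged. The sketch below builds a small scratch history (the helper commit and the commit names are hypothetical) and checks that git log and git rev-list select the same commits.

```shell
#!/bin/sh
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"                 # placeholder identity
git config user.email "example@example.invalid"

commit() { git commit -q --allow-empty -m "$1"; }   # hypothetical helper

commit X
git checkout -q -b feature; commit A
git checkout -q master;     commit o
git checkout -q feature
git merge -q -m E master                       # master merged into feature
commit F

# Human-readable listing with the same selection criteria.
git log --first-parent --no-merges --oneline feature --not master

subjects=$(git log --first-parent --no-merges --format=%s feature --not master | xargs)
from_log=$(git log --first-parent --no-merges --format=%H feature --not master)
from_rev_list=$(git rev-list --first-parent --no-merges feature --not master)
echo "selected: $subjects"
```

Here master has not merged feature, so the exclusion removes only o and X, and the merge E is dropped by --no-merges.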


[1] Actually, the reflogs, which keep a history of branch motions, do give you the ability to tell, but only on the one local machine; the question said that this must be done on a different system, where the reflogs are not available. In any case the reflog entries expire after a while, 90 days by default for the ones of interest here.

[2] Unless, of course, you use --no-walk.

answered Nov 02 '22 23:11 by torek