Git: get commits between two commits on a specific branch

Tags:

git

I've read a number of answers for similar scenarios, but not this precise one.

I need to generate a list of commits that occurred between two commits on a specific branch on git. So any commits made to other branches should be ignored. Any idea how to do this?

benjamin.keen asked Dec 31 '15 04:12



2 Answers

The notion of "between" is a little fuzzy here, and the term "branch" is not well defined either (in general with git, that is). Let me see if I can keep this short despite these issues (tough for me :-) ):

I'm going to assume that by "between", you mean the graph-theoretic notion of a path within the git commit graph. For instance, suppose we have this graph fragment (I can't draw arrows for the arcs between nodes, but pretend each -, /, or \ has an arrow-head on the left so that commits point straight left, down-and-left, or up-and-left, to their predecessor(s)):

              o - o
            /       \
... - A - o - o - o - B   <-- branch-x
            \   /
              o - o - C   <-- branch-y

I've identified three specific commits—A, B, and C—and provided two branch-tip names branch-x and branch-y pointing to commits B and C respectively. For git-usefulness-purposes let's also assume there's a tag A pointing to commit A, so that we don't have to spell out its SHA-1.

There are three possible paths by which you may reach commit A from commit B. One starts at B, goes up to the top line, descends back to the middle, and then moves left to A. One starts at B, goes directly left, and eventually reaches A. The last starts at B, goes back one step, goes down and left, up and left, and left once more to reach A.

Git gives you the special two-dot syntax, A..branch-x (or substitute the actual SHA-1 of node A if you don't have a tag named A), which, for most commands, means that they should visit all the nodes on all possible paths from B back to A, normally including node B itself and excluding node A. This is almost what you want but not quite, because you want to exclude commits that were made on other branches.
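To see the two-dot range in action, here is a minimal sketch that rebuilds the graph fragment above in a throwaway repository. The commit subjects (P, M1, M2, L1, L2, T1, T2), the helper c, and the temporary branch name top are my own labels, not anything git requires, and I'm assuming both merges were made while on branch-x:

```shell
#!/bin/sh
# Sketch only: a scratch repo shaped like the diagram; all names are made up.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/branch-x      # name the initial branch
c() { git commit -q --allow-empty -m "$1"; }   # one empty commit per node

c A; git tag A
c P                                            # the node just right of A
git branch top; git branch branch-y
c M1
git checkout -q branch-y; c L1
git checkout -q branch-x; git merge -q --no-ff -m M2 branch-y
git checkout -q branch-y; c L2; c C
git checkout -q top; c T1; c T2
git checkout -q branch-x; git merge -q --no-ff -m B top
git branch -q -d top

# Every commit on every path from B back to (but excluding) A.
between=$(git log --format=%s A..branch-x | LC_ALL=C sort | paste -sd' ' -)
echo "$between"
```

On this graph the range lists seven commits: B, the two top-line commits, the three middle-line commits to the right of A, and L1, the lower-line commit that was merged into the middle line. L2 and C are not reachable from B, so they never show up.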

This brings up the unanswerable (in general) question "which commits were made on which branch?"

Git tries to tell you that this question is invalid: that you should not care; that you should only care that all of those commits are reachable from node B. Git is actually usually right about this, but "usually" is not "always". Unfortunately, I haven't found any good ways to describe when you should (or do) care (actual good examples would be helpful for a text I'm attempting to write).

Meanwhile, let's press on. It seems clear, from the diagram above, that the commits on the lower line were all "made on branch-y". There is a problem here, because it seems clear, but it may not actually be true. Consider what happens if we re-draw the graph thus:

              o - o
            /       \
... - A - o - o - o - B   <-- branch-x
            \   /
              o
                \
                  o - C   <-- branch-y

This time it seems likely that branch-y was created just to hold the two lower-most commits. (It's even more likely if there is another branch name pointing to the lone third-row commit, although since your original problem statement said to exclude all other branches—not just branch branch-y—that wouldn't really matter in this case.)

Anyway, while I don't quite know what you mean by "branch", nor precisely which commits you want given this graph, let's take a look at the selectors git actually offers. There's one important one that might be exactly what you mean.

I mentioned earlier that "most" git commands use the same syntax specifiers. In fact, most git commands either contain the code from, or simply run, the git rev-list program, whose job is to select objects (usually commit objects) to obtain the list of commit IDs that you want to work with. It's also the command you will want for any kind of scripting.

The rev-list command has a dizzying number of options, many to assist in various kinds of graph traversals. The two most interesting here, I think, are --first-parent and --not.

Using --first-parent

Let's consider --first-parent first. Examine the graph above (either layout: they may look different but topologically, they're identical). Note that it's at merge commits, like node B itself and the node one step to the left of B, that paths fork. This is because it's only merge commits that have multiple outgoing arcs (this is in fact the definition of a merge commit: it's a node with two or more parents).

When git makes a merge commit, it numbers the individual outgoing arcs to the multiple parents. The first arc is special: it points to the commit that was the tip of the current branch at the time of the merge. That is, when you did git merge <sha-1-or-equivalent>, you were on some branch at that time,[1] and the SHA-1 of the then-current commit becomes the "first parent" of the new merge commit. The additional parents (the merged-in IDs, usually just one but git allows more) are the second, third, and so on.

Using the --first-parent flag tells git to traverse only the first-parent arcs. So git rev-list --first-parent branch-x will start with commit B, then find its first parent (we can't tell which one is first from our diagrams above), then follow that commit's first (possibly only) parent, and so on, all the way back to a root commit.

This may or may not be what you want (though it doesn't help with the notion of "between").
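As a rough illustration of parent numbering and --first-parent, the sketch below rebuilds the same graph in a scratch repository. The diagrams don't say which parent is first, so I'm assuming both merges were made while on branch-x; the commit subjects, the helper c, and the branch name top are invented for the example:

```shell
#!/bin/sh
# Sketch only: assumes the merges were run while branch-x was checked out.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/branch-x      # name the initial branch
c() { git commit -q --allow-empty -m "$1"; }   # one empty commit per node

c A; git tag A
c P
git branch top; git branch branch-y
c M1
git checkout -q branch-y; c L1
git checkout -q branch-x; git merge -q --no-ff -m M2 branch-y
git checkout -q branch-y; c L2; c C
git checkout -q top; c T1; c T2
git checkout -q branch-x; git merge -q --no-ff -m B top
git branch -q -d top

# Parents of merge commit B, in order: ^1 is where we were, ^2 was merged in.
first=$(git log -1 --format=%s 'branch-x^1')    # M2
second=$(git log -1 --format=%s 'branch-x^2')   # T2

# Follow only first-parent arcs from B, stopping before A.
mainline=$(git log --first-parent --format=%s A..branch-x | paste -sd' ' -)
echo "$first $second / $mainline"
```

Under that assumption the first-parent walk stays on the middle line (B, M2, M1, P) and skips both the top-line commits and L1. Had the merges been made from the other side, the same flag would follow a different line.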

Using --not

Now let's look at the --not flag.[2] Normally, git rev-list <SHA-1-ID-or-name> produces the set of all commits reachable from the given SHA-1 (resolving a name to an ID first as needed). That is, it follows all paths back to all roots. The result is a set of SHA-1 IDs. Using --not makes rev-list exclude these IDs. By itself, this negated-set is not useful, but when combined with a normal (non-negated) set, it is useful. In fact, it's how A..B works in the first place: rev-list first generates the set of all commits reachable from B, then subtracts away the set of all commits reachable from A.
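The set-subtraction reading can be checked with a small sketch. It uses a four-commit linear history in a scratch repository; the tag names A and B and the subjects c1 to c4 are placeholders I picked for the example:

```shell
#!/bin/sh
# Sketch: three spellings of "reachable from B, minus reachable from A".
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d) && cd "$repo"
git init -q .
for n in 1 2 3 4; do
  git commit -q --allow-empty -m "c$n"
done
git tag A HEAD~2    # points at c2
git tag B HEAD      # points at c4

dots=$(git rev-list A..B)          # two-dot shorthand
caret=$(git rev-list B ^A)         # explicit "yes B, no A"
negated=$(git rev-list B --not A)  # the same thing via --not

[ "$dots" = "$caret" ] && [ "$dots" = "$negated" ] && echo "same set"
git rev-list --count A..B          # counts c3 and c4
```

All three invocations print the same two commit IDs, which is why --not scales naturally to "everything on this ref, minus everything on those refs".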

Thus, depending on what you mean by "exclude all commits on other branches", it may be the case that what you want is:

git rev-list branch1 --not branch2 branch3 ... branchN

where you simply list every branch other than branch1 after the --not.

If we look at our diagram one last time, let's see which commits are selected by branch-x --not branch-y:

              o - o
            /       \
... - A - o - o - o - B   <-- branch-x
            \   /
              o - o - C   <-- branch-y

Commit C is obviously reachable from branch-y, as are all the commits on the lowermost line. The commit just to the right of A is reachable as well, as is commit A itself and all earlier commits. The remaining commits are not reachable from branch-y, but are reachable from branch-x commit B, so the resulting graph is:

              o - o
            /       \
            - o - o - B   <-- branch-x

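Note that rev-list has --boundary to include the "snip points" (if I can call them that): the excluded commits that are parents of kept commits. Adding --boundary puts back the node just after A in the original diagram, along with the first commit on the lowermost line (it is a parent of the merge just to the left of B), but A itself remains snipped-away.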
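Here is a hedged sketch of branch-x --not branch-y and of --boundary on the same graph, again in a scratch repository. Commit subjects, the helper c, and the branch name top are made-up labels, and the merges are assumed to have been made on branch-x:

```shell
#!/bin/sh
# Sketch only: scratch repo shaped like the diagram; all names are made up.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/branch-x      # name the initial branch
c() { git commit -q --allow-empty -m "$1"; }   # one empty commit per node

c A; git tag A
c P
git branch top; git branch branch-y
c M1
git checkout -q branch-y; c L1
git checkout -q branch-x; git merge -q --no-ff -m M2 branch-y
git checkout -q branch-y; c L2; c C
git checkout -q top; c T1; c T2
git checkout -q branch-x; git merge -q --no-ff -m B top
git branch -q -d top

# Commits reachable from branch-x but not from branch-y.
kept=$(git log --format=%s branch-x --not branch-y |
  LC_ALL=C sort | paste -sd' ' -)

# With --boundary, %m prints "-" in front of each boundary commit.
edge=$(git log --boundary --format='%m %s' branch-x --not branch-y |
  sed -n 's/^- //p' | LC_ALL=C sort | paste -sd' ' -)
echo "kept: $kept / boundary: $edge"
```

In this sketch the kept set is B, M1, M2, T1, T2, and the boundary commits are P (the node just after A) and L1, because each is an excluded parent of a kept commit.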

(Based on your revised question, --not is probably what you want here, and you just need to get a list of all branches, for which git for-each-ref --format '%(refname:short)' refs/heads is the proper scripting command. Separate out the one branch whose nodes you want kept, put the rest behind --not, and run git rev-list.)
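The scripting recipe might look roughly like this. It is a sketch under a few assumptions: the graph is the one from the diagrams, the top line is kept as a third branch that I've called branch-z, and the variable names keep and others are my own:

```shell
#!/bin/sh
# Sketch only: keep one branch, put every other local branch behind --not.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/branch-x      # name the initial branch
c() { git commit -q --allow-empty -m "$1"; }   # one empty commit per node

c A; c P
git branch branch-z; git branch branch-y       # branch-z is hypothetical
c M1
git checkout -q branch-y; c L1
git checkout -q branch-x; git merge -q --no-ff -m M2 branch-y
git checkout -q branch-y; c L2; c C
git checkout -q branch-z; c T1; c T2
git checkout -q branch-x; git merge -q --no-ff -m B branch-z

keep=branch-x
# All local branch names except the one whose commits we want to keep.
others=$(git for-each-ref --format='%(refname:short)' refs/heads |
  grep -vxF "$keep")

# $others is left unquoted on purpose so each name becomes its own argument.
only=$(git log --format=%s "$keep" --not $others |
  LC_ALL=C sort | paste -sd' ' -)
echo "$only"
```

With both branch-y and branch-z excluded, only B, M1, and M2 remain. If the repository has no other branches, grep exits non-zero and set -e stops the script, so a real script would want to handle that case.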


[1] This is effectively true even if you were on an anonymous branch (in "detached HEAD" mode, in other words). Some git commands will say that you're not on any branch, but you're still working with the same git internals that build branches. Your current branch simply has no name, in this case.

[2] Technically --not just flips a bit that marks subsequent SHA-1-or-identifier arguments as being negated. If they already have a prefix ^ symbol, they become "positive" references, otherwise they become negative references. Hence x ^y z means "yes x, no y, yes z" while x --not y z means "yes x, no y, no z" and x --not y ^z means "yes x, no y, yes z", for instance.

torek answered Oct 22 '22 02:10



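You can easily find a list of commits with the following command:

git log commit_x..commit_y

Name both commits relative to the branch you care about. For example,

git log dev~20..dev~10 will show you the commits from dev~19 through dev~10 on branch dev, that is, the 10 commits ending ten steps back from the tip (assuming a linear history; dev~N follows first parents, and commits merged in between the two endpoints are listed as well). Putting a branch name in front of the range, as in git log dev HEAD~20..HEAD~10, does not restrict the range to that branch: it adds the commits reachable from dev (other than those excluded by the left side of the range) to the output, and HEAD refers to whatever is currently checked out.

You can also filter the output further with the other parameters of git log.

You can also store these logs in a file using git log dev~20..dev~10 >> logs.txt (note that >> appends to the file).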

Sazzad Hissain Khan answered Oct 22 '22 01:10
