
git: cumulative diff with commit-limiting

Tags: git, diff

git log has some very useful commit-limiting options, such as --no-merges and --first-parent. I'd like to be able to use these options when generating a cumulative diff patch/stat/numstat for a range of commits.

With these commands:

git log --oneline --first-parent --no-merges --patch   29665b0..0b76a27
git log --oneline --first-parent --no-merges --stat    29665b0..0b76a27
git log --oneline --first-parent --no-merges --numstat 29665b0..0b76a27

the diff is not cumulative (the changes are listed individually for each commit).

With these commands:

git diff --patch   29665b0..0b76a27
git diff --stat    29665b0..0b76a27
git diff --numstat 29665b0..0b76a27

the diff is cumulative, but unfortunately git diff doesn't support the commit-limiting options.

So what I'd like is the cumulative diff functionality of git diff combined with the commit-limiting functionality of git log.

One idea I had is to use git log to generate a list of commit hashes, and then somehow pipe that list to git diff to generate a cumulative diff of the specified commits. Something like this (obviously this method of piping hashes to git diff doesn't actually work):

git log --pretty=format:%h --first-parent --no-merges 29665b0..0b76a27 | git diff

where --pretty=format:%h outputs the hashes of the matching commits.


Update

Thanks to @torek and @twalberg, I now understand git diff's operation more clearly. The range syntax 29665b0..0b76a27 is indeed misleading, and I now understand that it's not actually performing a cumulative diff over a range of commits. Looking through the docs, I found this:

"diff" is about comparing two endpoints, not ranges, and the range notations (<commit>..<commit> and <commit>...<commit>) do not mean a range as defined in the "SPECIFYING RANGES" section in gitrevisions(7).

Taking this into account, I'll rephrase my question. With these commands:

git log --oneline --first-parent --no-merges --patch   29665b0..0b76a27
git log --oneline --first-parent --no-merges --stat    29665b0..0b76a27
git log --oneline --first-parent --no-merges --numstat 29665b0..0b76a27

the changes are listed individually for each matching commit. How can I combine those individual changes, to produce a cumulative patch/stat/numstat?

The answers to the linked possible duplicate question are helpful, suggesting a solution: create a temporary branch, cherry-pick the relevant commits, and then generate the diff.

I've just posted an answer which uses this technique, but I'm still interested to know if there's a solution which doesn't require a temporary branch?
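For reference, the temporary-branch technique described above can be sketched as follows. This is only an illustration in a throwaway repository, not the posted answer itself: the branch name tmp-cumulative, the tag start, the commit helper and all file names are made up for the demo, and cherry-picks in a real history may stop on conflicts.

```shell
#!/bin/sh
# Throwaway repository: root, a merged two-commit topic branch, and two
# ordinary commits on the main line. All names are hypothetical.
set -e
repo=$(mktemp -d "${TMPDIR:-/tmp}/cherrydemo.XXXXXX")
cd "$repo"
git init -q .
git config user.email demo@example.com
git config user.name Demo
git config commit.gpgsign false

commit() {   # add a one-line file named after the commit, then commit it
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit root
git tag start
main=$(git symbolic-ref --short HEAD)   # whatever the default branch is called
git checkout -q -b topic
commit topic-1
commit topic-2
git checkout -q "$main"
commit main-1
git merge -q --no-ff -m "merge topic" topic
commit main-2

# Replay only the commits that the commit-limiting options select,
# oldest first, onto a temporary branch created at the start of the range.
git checkout -q -b tmp-cumulative start
git cherry-pick $(git rev-list --reverse --first-parent --no-merges "start..$main") >/dev/null

picked=$(git diff --numstat start tmp-cumulative)   # selected commits only
plain=$(git diff --numstat start "$main")           # plain endpoint diff
echo "$picked"

# Remove the temporary branch and the demo repository.
git checkout -q "$main"
git branch -q -D tmp-cumulative
cd / && rm -rf "$repo"
```

In this demo the cherry-picked diff lists only main-1.txt and main-2.txt, while the plain endpoint diff also includes the two files that arrived through the merge.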

TachyonVortex asked Aug 20 '14 11:08



1 Answer

There is at least one basic misapprehension here. Specifically, git diff is not really cumulative at all: instead, it's simply pairwise.

In particular, these two commands do the same thing:

git diff rev1 rev2
git diff rev1..rev2

That is, in git diff, there really is no such thing as a range in the first place.
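A quick way to check this for yourself is to compare the output of the two spellings in a scratch repository. The sketch below assumes nothing beyond a working git; the file name and commit messages are made up.

```shell
#!/bin/sh
# Build a throwaway repository with three commits.
set -e
repo=$(mktemp -d "${TMPDIR:-/tmp}/diffdemo.XXXXXX")
cd "$repo"
git init -q .
git config user.email demo@example.com
git config user.name Demo
git config commit.gpgsign false
for n in 1 2 3; do
    echo "line $n" >> file.txt
    git add file.txt
    git commit -q -m "commit $n"
done

# Both spellings compare the same pair of endpoints.
two_args=$(git diff HEAD~2 HEAD)
dotted=$(git diff HEAD~2..HEAD)
if [ "$two_args" = "$dotted" ]; then
    echo "identical"
fi

# Remove the demo repository.
cd / && rm -rf "$repo"
```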


With that out of the way, let's take a look behind the scenes at git log. What git log does with a range is really[1] to hand the range to git rev-list, which produces a list of every rev in the range, applying the modifiers along the way:

git rev-list 29665b0..0b76a27

spits out every rev reachable from 0b76a27 that is not also reachable from 29665b0. Adding --first-parent, --max-parents=1 (aka --no-merges), and so on filters away some of the revs that would be listed here.
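To see the filtering in action, here is a small sketch that builds a scratch repository containing one merge and counts what git rev-list selects with each option. The tag start, the branch topic and the commit helper are all invented for the demo.

```shell
#!/bin/sh
# Throwaway repository: one root commit, a two-commit topic branch, one
# commit on the main line, and a merge. All names are hypothetical.
set -e
repo=$(mktemp -d "${TMPDIR:-/tmp}/revlist.XXXXXX")
cd "$repo"
git init -q .
git config user.email demo@example.com
git config user.name Demo
git config commit.gpgsign false

commit() {   # add a one-line file named after the commit, then commit it
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit root
git tag start
main=$(git symbolic-ref --short HEAD)   # whatever the default branch is called
git checkout -q -b topic
commit topic-1
commit topic-2
git checkout -q "$main"
commit main-1
git merge -q --no-ff -m "merge topic" topic

# Count the revs in start..HEAD under each set of modifiers.
all=$(git rev-list --count start..HEAD)
no_merges=$(git rev-list --count --no-merges start..HEAD)
first_parent=$(git rev-list --count --first-parent start..HEAD)
both=$(git rev-list --count --first-parent --no-merges start..HEAD)
echo "$all $no_merges $first_parent $both"   # prints "4 3 2 1"

# Remove the demo repository.
cd / && rm -rf "$repo"
```

The unfiltered range holds the merge, main-1 and both topic commits; --no-merges drops the merge, --first-parent drops the topic commits, and together they leave only main-1.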

The final result is given back to git log, which then looks at each revision in the order git rev-list spits them out (this order is also controllable via --date-order, --topo-order and so on; see the documentation for git rev-list) and shows you each log entry, perhaps along with a diff as produced by git diff-tree (which, for single-parent commits, compares the commit to its parent).

What you can do, then, is invoke git rev-list yourself, directly, and then peel off the top and bottom revisions from its output. (In this particular case you probably want --topo-order too, to make sure that the last rev really is the earliest, graph-wise, regardless of dates.) For instance, in a script:

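#! /bin/sh
# template form with X's: GNU mktemp rejects "-t mydiff" (no X's in the template)
tempfile=$(mktemp "${TMPDIR:-/tmp}/mydiff.XXXXXX") || exit 1
trap 'rm -f "$tempfile"' 0 1 2 3 15
git rev-list --first-parent --no-merges --topo-order 29665b0..0b76a27 > "$tempfile"
# remember that the first rev listed is the last rev in the range
last=$(head -n 1 "$tempfile")
first=$(tail -n 1 "$tempfile")
rm -f "$tempfile" # done with it, don't leave it around while showing diff
# note: this compares $first with $last, so the changes made by $first itself
# are not included; use "$first^" as the left side if you want them
git diff "$first" "$last"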

You can get considerably fancier by using git rev-parse to parse options and split them into diff options vs rev-list options, but that's way beyond what you need here. The main thing to improve above is to get rid of the hard-coded revision-range.
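As a rough sketch of that last improvement, the range and the rev-list options can be taken from the command line instead. The function name cumdiff and its argument order (one diff option, then the range, then any rev-list options) are assumptions made for this example, and it compares the selected endpoints as-is, exactly like the script above. The second half just exercises it in a scratch repository.

```shell
#!/bin/sh
# cumdiff (hypothetical name): diff the oldest and newest commits that
# git rev-list selects for a range.
# usage: cumdiff <diff-option> <range> [rev-list options...]
cumdiff() {
    diffopt=$1
    range=$2
    shift 2
    revs=$(git rev-list --topo-order "$@" "$range") || return 1
    if [ -z "$revs" ]; then
        echo "cumdiff: no commits selected" >&2
        return 1
    fi
    last=$(printf '%s\n' "$revs" | head -n 1)    # newest rev is listed first
    first=$(printf '%s\n' "$revs" | tail -n 1)   # oldest rev is listed last
    git diff "$diffopt" "$first" "$last"
}

# Demo in a throwaway repository with three commits; the tag "start"
# marks the first one. File name and messages are made up.
set -e
repo=$(mktemp -d "${TMPDIR:-/tmp}/cumdiff.XXXXXX")
cd "$repo"
git init -q .
git config user.email demo@example.com
git config user.name Demo
git config commit.gpgsign false
for n in 1 2 3; do
    echo "line $n" >> file.txt
    git add file.txt
    git commit -q -m "commit $n"
    if [ "$n" -eq 1 ]; then
        git tag start
    fi
done

# start..HEAD selects commits 2 and 3, so the diff covers only commit 3's line.
out=$(cumdiff --numstat start..HEAD --no-merges)
echo "$out"

# Remove the demo repository.
cd / && rm -rf "$repo"
```

With the function in a real repository, the question's case would be invoked as cumdiff --stat 29665b0..0b76a27 --first-parent --no-merges.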


[1] Some git commands really do hand arguments off to git rev-list, as they're just shell scripts that use git rev-list and other git commands to handle this. Others are built together, so that git log and git rev-list are actually a single binary, and one part hands the job off to another part without invoking a new program.

In any case, note that git log master simply hands master off to git rev-list, which produces a list of all revs reachable from the branch-label master. If you add --no-walk, git rev-list produces just one rev, so that git log shows only that one revision.

torek answered Oct 04 '22 09:10