
Using Git diff to detect code movement + How to use diff options

Tags:

git

git-diff

Consider a file (1.c) that contains three functions, with changes made by authors M and J. Running git blame 1.c gives the following output:

    ^869c699 (M 2012-09-25 14:05:31 -0600  1)
    de24af82 (J 2012-09-25 14:23:52 -0600  2)
    de24af82 (J 2012-09-25 14:23:52 -0600  3)
    de24af82 (J 2012-09-25 14:23:52 -0600  4) public int add(int x, int y)  {
    de24af82 (J 2012-09-25 14:23:52 -0600  5)    int z = x+y;
    de24af82 (J 2012-09-25 14:23:52 -0600  6)    return z;
    de24af82 (J 2012-09-25 14:23:52 -0600  7) }
    de24af82 (J 2012-09-25 14:23:52 -0600  8)
    ^869c699 (M 2012-09-25 14:05:31 -0600  9) public int multiplication(int y, int z){
    ^869c699 (M 2012-09-25 14:05:31 -0600 10)    int result = y*z;
    ^869c699 (M 2012-09-25 14:05:31 -0600 11)    return temp;
    ^869c699 (M 2012-09-25 14:05:31 -0600 12) }
    ^869c699 (M 2012-09-25 14:05:31 -0600 13)
    ^869c699 (M 2012-09-25 14:05:31 -0600 14) public void main(){
    de24af82 (J 2012-09-25 14:23:52 -0600 15)    //this is a comment
    de24af82 (J 2012-09-25 14:23:52 -0600 16) }

Now, if author A swaps the positions of the multiplication() and add() functions and commits the change, git blame with -C -M can detect the code movement. See the following output:

    $ git blame  -C -M e4672cf82 1.c
    ^869c699 (M 2012-09-25 14:05:31 -0600  1)
    de24af82 (J 2012-09-25 14:23:52 -0600  2)
    de24af82 (J 2012-09-25 14:23:52 -0600  3)
    e4672cf8 (M 2012-09-25 14:26:39 -0600  4)
    de24af82 (J 2012-09-25 14:23:52 -0600  5)
    ^869c699 (M 2012-09-25 14:05:31 -0600  6) public int multiplication(int y, int z){
    ^869c699 (M 2012-09-25 14:05:31 -0600  7)    int result = y*z;
    ^869c699 (M 2012-09-25 14:05:31 -0600  8)    return temp;
    ^869c699 (M 2012-09-25 14:05:31 -0600  9) }
    ^869c699 (M 2012-09-25 14:05:31 -0600 10)
    ^869c699 (M 2012-09-25 14:05:31 -0600 11) public void main(){
    de24af82 (J 2012-09-25 14:23:52 -0600 12)    //this is a comment
    e4672cf8 (M 2012-09-25 14:26:39 -0600 13) }
    de24af82 (J 2012-09-25 14:23:52 -0600 14) public int add(int x, int y){
    de24af82 (J 2012-09-25 14:23:52 -0600 15)    int z = x+y;
    de24af82 (J 2012-09-25 14:23:52 -0600 16)    return z;
    e4672cf8 (M 2012-09-25 14:26:39 -0600 17) }

However, when I run git diff between these two revisions, it does not detect that the functions changed location and gives the following output:

    $ git diff -C -M de24af8..e4672cf82 1.c

    diff --git a/1.c b/1.c
    index 5b1fcba..56b4430 100644
    --- a/1.c
    +++ b/1.c
    @@ -1,10 +1,7 @@
     
     
     
    -public int add(int x, int y){
    -       int z = x+y;
    -       return z;
    -}
     
    +
     public int multiplication(int y, int z){
        int result = y*z;
    @@ -13,4 +10,8 @@ public int multiplication(int y, int z){
     
     public void main(){
        //this is a comment
    -}
    \ No newline at end of file
    +}
    +public int add(int x, int y){
    +       int z = x+y;
    +       return z;
    +}
    \ No newline at end of file

My questions are:

  1. How can I enforce detecting code movement in getting diff output? Is it even possible?

  2. git diff accepts several options, for example --minimal and --patience. How can I apply those options here? I tried one of them, but got the following error:

    $ git diff --minimal de24af8..e4672cf82 1.c
    usage: git diff <options> <rev>{0,2} -- <path>*

Can anyone give a sample of how to pass these options correctly?

Muhammad Asaduzzaman asked Sep 25 '12 20:09


2 Answers

As of Git 2.15, git diff now supports detection of moved lines with the --color-moved option. It even detects moves between files.

It works, obviously, only for colorized terminal output. As far as I can tell, there is no option to indicate moves in plain text patch format, which makes sense, since the patch format has no notation for a moved block.

For default behavior, try

git diff --color-moved 

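The option also takes a mode. When this was written the modes were no, default, plain, zebra and dimmed_zebra; newer Git releases also offer blocks and spell the last one dimmed-zebra, with dimmed_zebra kept as a deprecated synonym (use git help diff to get the current list and descriptions). For example: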

git diff --color-moved=zebra 
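
To try this on a case like the one in the question, here is a minimal sketch. It assumes Git 2.15 or later and a POSIX shell; the throwaway repository, the file contents, the demo identity and the commit helper are all made up for illustration. It follows the argument order from the usage message quoted in the question: options, then the revisions, then -- and the path.

```shell
#!/bin/sh
# Sketch only: throwaway repo, hypothetical file contents and identity.
set -e
cd "$(mktemp -d)"
git init -q

# Hypothetical helper so the commits work without any global Git identity.
commit() { git -c user.name=demo -c user.email=demo@example.com commit -q "$@"; }

# Revision 1: add() comes before multiplication().
cat > 1.c <<'EOF'
int add(int x, int y) {
    int z = x + y;
    return z;
}

int multiplication(int y, int z) {
    int result = y * z;
    return result;
}
EOF
git add 1.c
commit -m "initial"

# Revision 2: same two functions, add() moved below multiplication().
cat > 1.c <<'EOF'
int multiplication(int y, int z) {
    int result = y * z;
    return result;
}

int add(int x, int y) {
    int z = x + y;
    return z;
}
EOF
commit -am "move add()"

# Options first, then the two revisions, then "--" and the path.
# Colors only appear on a terminal unless --color=always is added.
git diff --color-moved=zebra HEAD~1 HEAD -- 1.c

# Other diff options are passed in the same position.
git diff --minimal  HEAD~1 HEAD -- 1.c
git diff --patience HEAD~1 HEAD -- 1.c
```

On a terminal, the first command should paint the moved function in the moved-line colors instead of the usual red and green; the other two print an ordinary diff.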
Inigo answered Oct 10 '22 11:10


This was the best answer at the time it was written, but it is no longer accurate. In 2017, Git 2.15 upgraded its diff to do move detection. As explained in the now top-voted answer, use git diff --color-moved.

Original answer:

What you're running up against here is that Git largely stays out of advanced diffing like this. There's a reason Git allows configuration of external diff and merge tools: you'd go insane without their assistance. Beyond Compare and Araxis Merge would both catch this movement, as an example.

The general class of problem you're looking to solve is a "structured merge": Structural Diff of two java source files

You might have a bit more luck with git-format-patch than with git-diff in this case, because the former provides more commit info, including the author and commit message, and also generates a patch file for each commit in the range you specify. Source: What is the difference between 'git format-patch' and 'git diff'?
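
As a rough illustration of what git format-patch produces, here is a small sketch. The throwaway repository, file contents, demo identity and the patches directory name are assumptions made up for the example.

```shell
#!/bin/sh
# Sketch only: throwaway repo with two hypothetical commits.
set -e
cd "$(mktemp -d)"
git init -q

echo 'int one(void) { return 1; }' > 1.c
git add 1.c
git -c user.name=demo -c user.email=demo@example.com commit -q -m "add one()"

echo 'int two(void) { return 2; }' >> 1.c
git -c user.name=demo -c user.email=demo@example.com commit -q -am "add two()"

# One patch file per commit in the range, written to ./patches.
git format-patch -q -o patches HEAD~1..HEAD

# Each file carries From:, Date: and Subject: headers ahead of the diff itself.
cat patches/0001-*.patch
```

Note that the body of each patch is still an ordinary diff, so this adds commit metadata but does not by itself mark moved lines.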

If you're looking for tips on detecting code moves generally, it's interesting to note that detection of code movement is explicitly not a goal of the all-powerful pickaxe. See this interesting exchange: http://gitster.livejournal.com/35628.html

If you wanted to detect who swapped the order, it seems your only option would be to do something like:

    git log -S'public int multiplication(int y, int z){
       int result = y*z;
       return temp;
    }

    public void main(){
       //this is a comment
    }
    public int add(int x, int y)  {
       int z = x+y;
       return z;
    }'
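
The same idea can be tried end to end with a small sketch. Everything here is hypothetical (throwaway repository, file contents, demo identity, commit helper). Instead of typing the text by hand, it passes the file's current content to -S, which avoids whitespace mismatches.

```shell
#!/bin/sh
# Sketch only: throwaway repo, hypothetical file contents and identity.
set -e
cd "$(mktemp -d)"
git init -q

# Hypothetical helper so the commits work without any global Git identity.
commit() { git -c user.name=demo -c user.email=demo@example.com commit -q "$@"; }

# First commit: a() above b().
printf 'int a(void) { return 1; }\n\nint b(void) { return 2; }\n' > 1.c
git add 1.c
commit -m "a then b"

# Second commit: the two functions swapped.
printf 'int b(void) { return 2; }\n\nint a(void) { return 1; }\n' > 1.c
commit -am "swap order"

# -S lists commits that change how many times the given string occurs.
# The text in its new order first appears in the swapping commit.
git log --format=%s -S"$(cat 1.c)" -- 1.c
```

With this history the last command should list only the "swap order" commit, because the first commit never contained the functions in that order.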

What you're looking for is git blame -M<num> -n, which does something pretty similar to what you're asking:

    -M|<num>|
        Detect moved or copied lines within a file. When a commit moves or
        copies a block of lines (e.g. the original file has A and then B,
        and the commit changes it to B and then A), the traditional blame
        algorithm notices only half of the movement and typically blames
        the lines that were moved up (i.e. B) to the parent and assigns
        blame to the lines that were moved down (i.e. A) to the child
        commit. With this option, both groups of lines are blamed on the
        parent by running extra passes of inspection.

        <num> is optional but it is the lower bound on the number of
        alphanumeric characters that git must detect as moving/copying
        within a file for it to associate those lines with the parent
        commit. The default value is 20.

    -n, --show-number
        Show the line number in the original commit (Default: off).
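
A quick way to see the difference is to compare plain blame with blame -M on a small history. This is only a sketch: the throwaway repository, the file contents and the author names M and A (borrowed from the question) are assumptions for illustration.

```shell
#!/bin/sh
# Sketch only: throwaway repo, hypothetical file contents and authors.
set -e
cd "$(mktemp -d)"
git init -q

# Author M writes both functions.
cat > 1.c <<'EOF'
int add(int x, int y) {
    int z = x + y;
    return z;
}

int multiplication(int y, int z) {
    int result = y * z;
    return result;
}
EOF
git add 1.c
git -c user.name=M -c user.email=m@example.com commit -q -m "initial"

# Author A only swaps the order of the two functions.
cat > 1.c <<'EOF'
int multiplication(int y, int z) {
    int result = y * z;
    return result;
}

int add(int x, int y) {
    int z = x + y;
    return z;
}
EOF
git -c user.name=A -c user.email=a@example.com commit -q -am "swap functions"

# Plain blame: one of the two blocks is typically attributed to the swap.
git blame -n 1.c

# With -M the moved block is traced back to the commit that wrote it;
# -n shows each line's number in that original commit.
git blame -M -n 1.c
```

Both moved blocks here contain more than 20 alphanumeric characters, so they clear the default -M threshold described above.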
kayaker243 answered Oct 10 '22 09:10