
How to understand diff -u in Linux?

Example command:

diff -r -u -P a.c b.c > diff.patch

I've looked in the man page.

It says that diff -u produces output in a "unified" format. What does that mean, and when should we use it?

Thanks a lot.

Nicki Wei asked Mar 17 '15 08:03




2 Answers

From Wikipedia (diff utility):

The unified format (or unidiff) inherits the technical improvements made by the context format, but produces a smaller diff with old and new text presented immediately adjacent. Unified format is usually invoked using the "-u" command line option. This output is often used as input to the patch program. Many projects specifically request that "diffs" be submitted in the unified format, making unified diff format the most common format for exchange between software developers.

...

The format starts with the same two-line header as the context format, except that the original file is preceded by "---" and the new file is preceded by "+++". Following this are one or more change hunks that contain the line differences in the file. The unchanged, contextual lines are preceded by a space character, addition lines are preceded by a plus sign, and deletion lines are preceded by a minus sign.

A hunk begins with range information and is immediately followed with the line additions, line deletions, and any number of the contextual lines. The range information is surrounded by double-at signs, and combines onto a single line what appears on two lines in the context format (above). The format of the range information line is as follows:

    @@ -l,s +l,s @@ optional section heading

...

Whatever format diff emits, the output describes a series of steps that transform a source file into a destination file. Let's see a simple example of how this works with the unified format.

Given the following files:

from.txt

a
b

to.txt

a
c

The output of diff -u from.txt to.txt is:

--- from.txt    2015-03-17 04:34:47.076997087 -0430
+++ to.txt      2015-03-17 04:35:27.872996388 -0430
@@ -1,2 +1,2 @@
 a
-b
+c
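
The same hunk can be reproduced without the diff binary. This is only a sketch, assuming Python 3 and its standard difflib module (neither is part of the original answer). The two files are inlined as lists of lines, and difflib leaves out the timestamps unless you pass them explicitly:

```python
import difflib

# Contents of from.txt and to.txt, inlined so the example is self-contained.
from_lines = ["a\n", "b\n"]
to_lines = ["a\n", "c\n"]

# unified_diff yields the header lines, the hunk range line and the hunk body.
diff_lines = list(
    difflib.unified_diff(from_lines, to_lines, fromfile="from.txt", tofile="to.txt")
)

print("".join(diff_lines), end="")
# --- from.txt
# +++ to.txt
# @@ -1,2 +1,2 @@
#  a
# -b
# +c
```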

Explanation. Header description:

--- from.txt    2015-03-17 22:42:18.575039925 -0430  <-- from-file time stamp
+++ to.txt      2015-03-17 22:42:10.495040064 -0430  <-- to-file time stamp

This diff contains just one hunk (only one set of changes is needed to turn from.txt into to.txt):

@@ -1,2 +1,2 @@  <-- A hunk, a block describing changes between both files, there could be several of these in the diff -u output
   ^    ^
   |   (+) means that this change starts at line 1 and involves 2 lines in the to.txt file
  (-) means that this change starts at line 1 and involves 2 lines of the from.txt file
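
To make the range line concrete, here is a rough sketch that pulls the four numbers out of it with a regular expression. The function name parse_hunk_header and the dictionary keys are made up for this illustration. It assumes the usual convention that an omitted line count means 1:

```python
import re

# @@ -start[,count] +start[,count] @@ optional section heading
HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@(?: (.*))?$")


def parse_hunk_header(line):
    """Split a unified-diff range line into its numeric parts (hypothetical helper)."""
    match = HUNK_HEADER.match(line)
    if match is None:
        raise ValueError(f"not a hunk header: {line!r}")
    old_start, old_count, new_start, new_count, heading = match.groups()
    return {
        "old_start": int(old_start),
        # A missing count means the hunk covers exactly one line.
        "old_count": int(old_count) if old_count is not None else 1,
        "new_start": int(new_start),
        "new_count": int(new_count) if new_count is not None else 1,
        "heading": heading or "",
    }


print(parse_hunk_header("@@ -1,2 +1,2 @@"))
```

For the hunk above this gives old_start 1, old_count 2, new_start 1 and new_count 2, which matches the reading of the (-) and (+) parts.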

Next, the list of changes:

 a   <-- This line remains the same in both files, so it won't be changed
-b   <-- This line has to be removed from the "from.txt" file to transform it into the "to.txt" file
+c   <-- This line has to be added to the "from.txt" file to transform it into the "to.txt" file
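
The three rules above are roughly what patch follows when it applies a hunk. The sketch below is a toy version of that idea, not how patch is actually implemented. The name apply_hunk is hypothetical, lines are stored without their trailing newline, and the hunk is assumed to touch at least one line of the old file:

```python
def apply_hunk(source_lines, old_start, hunk_body):
    """Apply one unified-diff hunk to a list of lines (toy illustration only)."""
    pos = old_start - 1                 # hunk line numbers are 1-based
    result = list(source_lines[:pos])   # everything before the hunk is copied as-is
    for entry in hunk_body:
        marker, text = entry[0], entry[1:]
        if marker == " ":               # context line: must match, then keep it
            if source_lines[pos] != text:
                raise ValueError(f"context mismatch at line {pos + 1}")
            result.append(text)
            pos += 1
        elif marker == "-":             # deletion: must match, then skip it
            if source_lines[pos] != text:
                raise ValueError(f"deletion mismatch at line {pos + 1}")
            pos += 1
        elif marker == "+":             # addition: emit it, consume nothing
            result.append(text)
        else:
            raise ValueError(f"unexpected hunk line: {entry!r}")
    result.extend(source_lines[pos:])   # everything after the hunk is copied as-is
    return result


patched = apply_hunk(["a", "b"], 1, [" a", "-b", "+c"])
print(patched)  # ['a', 'c']
```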

Here are some Stack Overflow answers with really nice info about this subject:

https://stackoverflow.com/a/10950496/1041822
https://stackoverflow.com/a/2530012/1041822

And some other useful documentation:

https://linuxacademy.com/blog/linux/introduction-using-diff-and-patch/
http://www.artima.com/weblogs/viewpost.jsp?thread=164293

higuaro answered Oct 13 '22 14:10


The term "unified" was made up. It might have been better to call it "concise".

The point of diff -u is that it is a more concise representation than context diff. Quoting from the original description of Wayne Davison's posting of unidiff to comp.sources.misc (volume 14, 31 Aug 90):

I've created a new context diff format that combines the old and new chunks into
one unified hunk.  The result?  The unified context diff, or "unidiff."

Posting your patch using a unidiff will usually cut its size down by around
25% (I've seen from 12% to 48%, depending on how many redundant context lines
are removed).  Even if the diffs are generated with only 2 lines of context,
the savings still average around 20%.

Keep in mind that *no information is lost* by the conversion process.  Only
the redundancy of having multiple identical context lines.  [...]
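
The size difference is easy to check for yourself. The following sketch assumes Python 3 and its standard difflib module (not something the posting refers to), and the file names and contents are invented. It builds both formats for the same one-line change and compares their sizes; the exact saving depends on the input, so no particular percentage is claimed here:

```python
import difflib

# Ten-line file with a single edited line in the middle (made-up sample data).
old = [f"line {i}\n" for i in range(1, 11)]
new = list(old)
new[4] = "line five, edited\n"

# Both calls use difflib's default of 3 context lines.
unified = list(difflib.unified_diff(old, new, "old.txt", "new.txt"))
context = list(difflib.context_diff(old, new, "old.txt", "new.txt"))

unified_size = sum(len(line) for line in unified)
context_size = sum(len(line) for line in context)
saving = 100 * (context_size - unified_size) / context_size

print(f"context diff: {len(context)} lines, {context_size} characters")
print(f"unified diff: {len(unified)} lines, {unified_size} characters")
print(f"saving: {saving:.0f}%")
```

The context format prints the surrounding lines once for the old file and once for the new one, while the unified format prints them a single time, which is where the saving comes from.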

Here are some useful links:

  • How to read a patch or diff and understand its structure to apply it manually
  • What is the format of a patch file?

Not useful (and misleading)

  • 2.2.2 Unified Format, which appears to omit attribution.

Thomas Dickey answered Oct 13 '22 13:10