
Compare two files line by line and generate the difference in another file

Tags: shell, unix


diff(1) is not the answer, but comm(1) is.

NAME
       comm - compare two sorted files line by line

SYNOPSIS
       comm [OPTION]... FILE1 FILE2

...

       -1     suppress lines unique to FILE1

       -2     suppress lines unique to FILE2

       -3     suppress lines that appear in both files

So, to write the lines that appear only in file1 to file3:

comm -2 -3 file1 file2 > file3

The input files must be sorted. If they are not, sort them first. This can be done with a temporary file, or...

comm -2 -3 <(sort file1) <(sort file2) > file3

provided that your shell supports process substitution (bash does).
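For shells without process substitution, the temporary-file route mentioned above might look like the following sketch. The sample contents and the .sorted file names are made up for illustration; only the comm invocation comes from the answer itself.

```shell
#!/bin/sh
# Work in a scratch directory that is removed when the script exits.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
cd "$tmpdir" || exit 1

# Hypothetical sample data, deliberately unsorted.
printf '%s\n' pear apple cherry > file1
printf '%s\n' cherry banana > file2

# comm needs sorted input, so sort each file into a temporary copy first.
# LC_ALL=C makes sort and comm agree on the collation order.
LC_ALL=C sort file1 > file1.sorted
LC_ALL=C sort file2 > file2.sorted

# -2 drops lines unique to file2, -3 drops lines common to both,
# leaving only the lines that appear in file1 alone.
LC_ALL=C comm -2 -3 file1.sorted file2.sorted > file3

cat file3
```

With this sample data, file3 ends up holding apple and pear, the two lines that file2 lacks.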


The Unix utility diff is meant for exactly this purpose.

$ diff -u file1 file2 > file3

The -u option selects the unified output format. See the manual (man diff) for other options and output formats.
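One detail worth knowing when scripting this: diff exits with status 0 when the files are identical, 1 when they differ, and 2 when something went wrong. A rough sketch, using made-up sample files, of saving the unified diff while checking that status:

```shell
#!/bin/sh
# Work in a scratch directory that is removed when the script exits.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
cd "$tmpdir" || exit 1

# Hypothetical sample data.
printf '%s\n' abcd efgh > file1
printf '%s\n' abcd > file2

# Capture the exit status instead of treating 1 as a failure.
status=0
diff -u file1 file2 > file3 || status=$?

case $status in
  0) echo "files are identical" ;;
  1) echo "differences written to file3" ;;
  *) echo "diff failed" >&2 ;;
esac

# Skip the two header lines (--- and +++), which carry timestamps,
# and show only the hunk. Removed lines are prefixed with "-".
tail -n +3 file3
```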


Consider this:
file a.txt:

abcd
efgh

file b.txt:

abcd

You can find the difference with:

diff -a --suppress-common-lines -y a.txt b.txt

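The output will be (the < marker flags a line that is present only in a.txt; the amount of padding before it depends on the output width):

efgh                                                          <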

You can redirect the output to a file (c.txt) using:

diff -a --suppress-common-lines -y a.txt b.txt > c.txt

This will answer your question:

"...which contains the lines in file1 which are not present in file2."
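If the output file should hold only the bare lines, without any diff markers, one possible variation is to filter diff's default output through sed. This is a sketch rather than part of the answer above, and it assumes a diff that emits the traditional default format (as GNU diff does), where lines found only in the first file are prefixed with "< ".

```shell
#!/bin/sh
# Work in a scratch directory that is removed when the script exits.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
cd "$tmpdir" || exit 1

# The same sample files as in the answer above.
printf '%s\n' abcd efgh > a.txt
printf '%s\n' abcd > b.txt

# Keep only the lines diff marks with "< " (present in a.txt, absent from
# b.txt at that position) and strip the marker.
diff a.txt b.txt | sed -n 's/^< //p' > c.txt

cat c.txt
```

Unlike comm, diff is order-sensitive, so a line that exists in both files but at a different position can still be reported.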


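Sometimes diff is the utility you need, but sometimes join is more appropriate. The files need to be pre-sorted or, if you are using a shell that supports process substitution, such as bash, ksh or zsh, you can sort them on the fly. Note that join matches lines on a key field (the first whitespace-separated field by default), not on the whole line. The -v 1 option prints the lines of the first file that have no match in the second: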

join -v 1 <(sort file1) <(sort file2)
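Because join pairs lines on their first field by default while comm compares whole lines, the two can disagree on multi-column input. The sketch below uses made-up sample data and temporary sorted copies (instead of process substitution) to show the difference; the output file names are purely illustrative.

```shell
#!/bin/sh
# Work in a scratch directory that is removed when the script exits.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
cd "$tmpdir" || exit 1

# Hypothetical two-column data: the key "apple" occurs in both files,
# but the full lines differ.
printf '%s\n' 'banana 2' 'apple 1' > file1
printf '%s\n' 'apple 9' > file2

LC_ALL=C sort file1 > file1.sorted
LC_ALL=C sort file2 > file2.sorted

# join pairs lines on the first field, so "apple 1" counts as matched
# and only "banana 2" is reported as unpaired.
LC_ALL=C join -v 1 file1.sorted file2.sorted > by_key.txt

# comm compares entire lines, so "apple 1" is unique to file1 as well.
LC_ALL=C comm -2 -3 file1.sorted file2.sorted > by_line.txt

echo "join -v 1:"; cat by_key.txt
echo "comm -2 -3:"; cat by_line.txt
```

For files with one word per line the two commands give the same result; the choice matters once lines carry more than a key.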