I'm sure I once found a shell command that could print the lines common to two or more files. What is its name? It was much simpler than <code>diff</code>.


Shell command to find lines common in two files

4 Answers

The command you are seeking is comm. For example:

comm -12 1.sorted.txt 2.sorted.txt

Here:

-1 : suppress column 1 (lines unique to 1.sorted.txt)

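-2 : suppress column 2 (lines unique to 2.sorted.txt)

With both columns suppressed, only column 3 (the lines common to both files) is printed. Note that comm expects both input files to be sorted.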
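The other flag combinations follow the same pattern. Here is a minimal sketch, assuming two small sorted demo files; the function names and file names are made up for illustration and are not part of comm itself:

```shell
# Hypothetical helper names; each one just wraps a comm flag combination.
common_lines()   { comm -12 "$1" "$2"; }   # keep column 3: lines in both files
only_in_first()  { comm -23 "$1" "$2"; }   # keep column 1: lines only in the first file
only_in_second() { comm -13 "$1" "$2"; }   # keep column 2: lines only in the second file

# Demo with throwaway files that are already sorted.
tmpdir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$tmpdir/a.sorted.txt"
printf '%s\n' banana cherry date  > "$tmpdir/b.sorted.txt"

common_lines   "$tmpdir/a.sorted.txt" "$tmpdir/b.sorted.txt"   # banana and cherry
only_in_first  "$tmpdir/a.sorted.txt" "$tmpdir/b.sorted.txt"   # apple
only_in_second "$tmpdir/a.sorted.txt" "$tmpdir/b.sorted.txt"   # date

rm -rf "$tmpdir"
```

When columns are suppressed, the remaining column is printed without leading tabs, so the output can be piped straight into other tools.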


answered Oct 17 '22 09:10

Jonathan Leffler

To easily apply the comm command to unsorted files, use Bash's process substitution:

$ bash --version
GNU bash, version 3.2.51(1)-release
Copyright (C) 2007 Free Software Foundation, Inc.
$ cat > abc
123
567
132
$ cat > def
132
777
321

So the files abc and def have one line in common, the one with "132". Using comm on unsorted files:

$ comm abc def
123
    132
567
132
    777
    321
$ comm -12 abc def # No output! The common line is not found
$

The last command produced no output: because the inputs were not sorted, the common line was not discovered.

Now use comm on sorted files, sorting the files with process substitution:

$ comm <( sort abc ) <( sort def )
123
            132
    321
567
    777
$ comm -12 <( sort abc ) <( sort def )
132

Now we get the 132 line!
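If process substitution is not available (for example in a plain POSIX sh), temporary files should give the same result. This is a minimal sketch, assuming mktemp is installed; common_unsorted is a made-up helper name, and forcing LC_ALL=C is my own choice so that sort and comm agree on the ordering:

```shell
# common_unsorted FILE1 FILE2
# Hypothetical helper: prints the lines common to two unsorted files.
common_unsorted() {
    tmp1=$(mktemp) || return 1
    tmp2=$(mktemp) || { rm -f "$tmp1"; return 1; }

    # Sort both inputs with the same collation that comm will use.
    LC_ALL=C sort "$1" > "$tmp1"
    LC_ALL=C sort "$2" > "$tmp2"

    # Keep only column 3 (lines present in both files).
    LC_ALL=C comm -12 "$tmp1" "$tmp2"
    status=$?

    # Clean up the temporary files and report comm's exit status.
    rm -f "$tmp1" "$tmp2"
    return "$status"
}
```

With the abc and def files from above, running common_unsorted abc def should print the single line 132.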

answered Oct 17 '22 10:10

Stephan Wehner

To complement the Perl one-liner, here's its awk equivalent:

awk 'NR==FNR{arr[$0];next} $0 in arr' file1 file2

This will read all lines from file1 into the array arr[], and then check for each line in file2 if it already exists within the array (i.e. file1). The lines that are found will be printed in the order in which they appear in file2. Note that the comparison in arr uses the entire line from file2 as index to the array, so it will only report exact matches on entire lines.
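One consequence of the above is that a common line repeated in file2 is printed once per occurrence. If each common line should appear only once, a second array can track what has already been printed. A minimal sketch; common_once and the array names seen and printed are my own, not part of the original one-liner:

```shell
# common_once FILE1 FILE2
# Hypothetical helper: prints each line of FILE2 that also occurs in FILE1,
# once only, in the order of its first appearance in FILE2.
common_once() {
    awk '
        NR == FNR { seen[$0]; next }        # first file: remember every line
        ($0 in seen) && !printed[$0]++      # second file: print on first match only
    ' "$1" "$2"
}
```

Like the original one-liner, this needs no sorting, but it holds all of file1 in memory.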

answered Oct 17 '22 10:10

Tatjana Heuser

Maybe you mean comm?

Compare sorted files FILE1 and FILE2 line by line.

With no options, produce three-column output. Column one contains lines unique to FILE1, column two contains lines unique to FILE2, and column three contains lines common to both files.

The secret to finding this kind of information is the info pages. For GNU programs, they are much more detailed than the man pages. Try info coreutils and it will list all the small, useful utilities.

answered Oct 17 '22 11:10

Johannes Schaub - litb


Tags:

shell

command-line

too much php
