
Compare 2 Unix Files and Output Matching Lines to a New File?

I have 2 *nix files. All of the data is on one single line in each file. Each value is separated by a null character. Some of the values in the data match.

How would I parse this data into a new file listing only the matching values?

I figure I could use sed to change the null characters into newlines? From there on I'm not real sure...

Any ideas?

rreeves asked Jan 04 '12 04:01


2 Answers

Use tr, sort and comm:

Convert nulls into new lines, and sort the result:

$ tr '\000' '\n' < file1 | sort > file1.txt
$ tr '\000' '\n' < file2 | sort > file2.txt

Then use comm to get the lines that are common to both files:

$ comm -1 -2 file1.txt file2.txt
<lines shown here are the common lines between file1.txt and file2.txt>
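
To try the steps above end to end, here is a minimal sketch. The sample values and the output name common.txt are made up for illustration; LC_ALL=C is an added assumption so that sort and comm agree on the same ordering.

```shell
# Build two sample NUL-separated files (hypothetical data for illustration).
printf 'apple\000cherry\000banana\000' > file1
printf 'banana\000date\000apple\000' > file2

# Turn each NUL into a newline and sort; comm needs sorted input.
# LC_ALL=C keeps sort and comm agreeing on one collation order.
tr '\000' '\n' < file1 | LC_ALL=C sort > file1.txt
tr '\000' '\n' < file2 | LC_ALL=C sort > file2.txt

# -1 and -2 suppress the lines unique to each file, leaving only the
# common ones, which are written to a new file as the question asks.
LC_ALL=C comm -1 -2 file1.txt file2.txt > common.txt
cat common.txt
```

With this sample data, common.txt should contain apple and banana, one per line.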
holygeek answered Sep 19 '22 12:09


If there are no duplicate values within file1 or file2, you can do this:

( tr '\0' '\n' < file1; tr '\0' '\n' < file2 ) | sort | uniq -c | grep -Ev '^ *1 '

This prints each value that appears in both files, prefixed by its count (2). Values that appear only once are filtered out.
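
If a value may be repeated inside a single file, one possible variation (a sketch, not part of the answer above) is to de-duplicate each file with sort -u first and then let uniq -d print only the values seen twice. The sample data and the output name common.txt are hypothetical.

```shell
# Sample data: "apple" is repeated inside file1 but is absent from file2.
printf 'apple\000apple\000banana\000' > file1
printf 'banana\000cherry\000' > file2

# De-duplicate each file on its own first, so a value repeated inside
# one file cannot be mistaken for a value shared by both files.
{
  tr '\000' '\n' < file1 | LC_ALL=C sort -u
  tr '\000' '\n' < file2 | LC_ALL=C sort -u
} | LC_ALL=C sort | uniq -d > common.txt

# uniq -d prints one copy of each line that occurs more than once.
cat common.txt
```

Here only banana ends up in common.txt, because apple occurs in just one of the two files.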

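If the values in each file are already in sorted order, you can skip the sort and the temporary files and feed tr straight into comm:

comm -1 -2 <(tr '\0' '\n' < file1) <(tr '\0' '\n' < file2)

comm expects both inputs to be sorted; on unsorted input it can miss matches, so add a sort stage inside each <(...) in that case. This approach is not portable: it requires process substitution, which Bash (as well as ksh and zsh) provides but POSIX sh does not.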
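
If the original order of the values matters, comm is an awkward fit because it expects sorted input. One alternative, swapped in here as an assumption rather than taken from the answer, is a two-pass awk filter that needs neither sorting nor process substitution. The names file1.lines, file2.lines and common.txt are hypothetical.

```shell
# Sample data, deliberately unsorted.
printf 'cherry\000apple\000banana\000' > file1
printf 'banana\000date\000cherry\000' > file2

# Convert NUL separators to newlines using plain temporary files.
tr '\000' '\n' < file1 > file1.lines
tr '\000' '\n' < file2 > file2.lines

# While reading the first argument (NR == FNR), record every value of file2.
# While reading the second argument, print each file1 value that was
# recorded, so the output keeps file1's original order.
awk 'NR == FNR { seen[$0] = 1; next } $0 in seen' file2.lines file1.lines > common.txt
cat common.txt
```

With this data, common.txt lists cherry and then banana, following file1's order. A value repeated in file1 would be printed once per occurrence.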

Barton Chittenden answered Sep 21 '22 12:09