Compare 2 Unix Files and Output Matching Lines to a New File?

Question

I have 2 *nix files. All of the data is on a single line in each file, and each value is separated by a null character. Some of the values in the data match.

How would I parse this data into a new file listing only the matching values?

I figure I could use sed to change the null characters into newlines? From there on I'm not really sure...

Any ideas?

holygeek · Accepted Answer

Use tr, sort and comm:

Convert nulls into new lines, and sort the result:

$ tr '\000' '\n' < file1 | sort > file1.txt
$ tr '\000' '\n' < file2 | sort > file2.txt

Then use comm to get the lines that are common to both files:

$ comm -1 -2 file1.txt file2.txt
<lines shown here are the common lines between file1.txt and file2.txt>
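
Since the question asks for the matches in a new file, the steps above can be put together as a small script. This is only a sketch: the sample values and the output name matches.txt are made up for illustration, and -12 is just the combined spelling of -1 -2.

```shell
#!/bin/sh
# Sketch: find values common to two NUL-separated files and save them.
# The sample data and the name matches.txt are hypothetical.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Sample inputs: every value is terminated by a NUL byte.
printf 'pear\000apple\000kiwi\000' > file1
printf 'kiwi\000plum\000apple\000' > file2

# Turn NULs into newlines and sort (comm expects sorted input).
tr '\000' '\n' < file1 | sort > file1.txt
tr '\000' '\n' < file2 | sort > file2.txt

# Suppress columns 1 and 2 (lines unique to either file) and
# redirect what is left, the common lines, into a new file.
comm -12 file1.txt file2.txt > matches.txt

cat matches.txt
```

With the sample data, matches.txt ends up holding apple and kiwi, one per line.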

Barton Chittenden · Answer

If there are no duplicate values within file1 or file2, you can do this:

( tr '\0' '\n' < file1; tr '\0' '\n' < file2 ) | sort | uniq -c | grep -Ev '^ *1 '

This prints each value that appears in both files, prefixed by its count; values that occur only once are filtered out.
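
If the files might contain repeated values after all, one possible variation is to de-duplicate each file with sort -u before combining them, and then let uniq -d print only the values that show up twice. This is a sketch with made-up sample data, not part of the original answer; matches.txt is a hypothetical output name.

```shell
#!/bin/sh
# Sketch: common values via sort -u and uniq -d, tolerating duplicates
# inside each input file. Sample data and matches.txt are hypothetical.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Each file repeats one of its own values on purpose.
printf 'a\000b\000b\000c\000' > file1
printf 'b\000c\000c\000d\000' > file2

# sort -u leaves each value at most once per file, so after merging,
# a value can appear twice only if it is present in both files.
{
    tr '\000' '\n' < file1 | sort -u
    tr '\000' '\n' < file2 | sort -u
} | sort | uniq -d > matches.txt

cat matches.txt
```

Here uniq -d prints one copy of each repeated line and no counts, so no grep filtering is needed.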

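If you want to avoid temporary files, you can feed comm directly with process substitution. Note that comm expects sorted input, so each stream has to be sorted first and the original order of the fields is not preserved:

comm -1 -2 <(tr '\0' '\n' < file1 | sort) <(tr '\0' '\n' < file2 | sort)

This approach is not portable: it requires the process substitution feature of Bash, which plain POSIX sh does not have.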
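
If the original order of the values in file1 does matter, comm is not a good fit because it needs sorted input. One alternative, swapped in here and not taken from the answer above, is a two-pass awk lookup. The sketch below assumes the values fit in memory; the sample data and the names file1.lines, file2.lines and matches.txt are hypothetical.

```shell
#!/bin/sh
# Sketch: keep file1's original order while printing only the values
# that also occur in file2. File names and data are hypothetical.
set -e
workdir=$(mktemp -d)
cd "$workdir"

printf 'pear\000kiwi\000apple\000' > file1
printf 'apple\000plum\000kiwi\000' > file2

# Convert NULs to newlines; no sorting this time.
tr '\000' '\n' < file1 > file1.lines
tr '\000' '\n' < file2 > file2.lines

# While reading the first argument (file2.lines) NR equals FNR, so
# every value is recorded in the seen array. For the second argument
# (file1.lines) a line is printed only if it was recorded earlier.
awk 'NR == FNR { seen[$0] = 1; next } $0 in seen' \
    file2.lines file1.lines > matches.txt

cat matches.txt
```

With this sample data the output lists kiwi before apple, matching their order in file1 instead of sorted order.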

Tags: linux, bash, sed, awk, perl

rreeves
