
Compare files with awk

Hi, I have two similar files (both with 3 columns). I'd like to check whether these two files contain the same elements (possibly listed in a different order). First of all, I'd like to compare only the first column.

file1.txt

"aba" 0 0
"abc" 0 1
"abd" 1 1
"xxx" 0 0

file2.txt

"xyz" 0 0
"aba" 0 0
"xxx" 0 0
"abc" 1 1

How can I do it using awk? I looked around but found only complicated examples. What if I also want to include the other two columns in the comparison? The output should give me the number of matching elements.

Titus Pullo asked Feb 25 '13 11:02


2 Answers

To print the common elements in both files:

$ awk 'NR==FNR{a[$1];next}$1 in a{print $1}' file1.txt file2.txt
"aba"
"abc"
"xxx"

Explanation:

NR and FNR are awk variables that store the total number of records read so far and the number of records read from the current file, respectively (by default, a record is a line).

NR==FNR {     # Only true while reading the first file
    a[$1]     # Create an array entry keyed on column one
    next      # Skip the remaining rules and read the next line
}
$1 in a {     # True if column one of the second file is a key in the array
    print $1  # If so, print it
}
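
To see why NR==FNR singles out the first file, it can help to print both counters for every line. This is only a small sketch: it recreates the question's sample data in a temporary directory (the tmp and nr_fnr variable names are made up for the example).

```shell
# Recreate the sample files from the question in a temporary directory.
tmp=$(mktemp -d)
printf '%s\n' '"aba" 0 0' '"abc" 0 1' '"abd" 1 1' '"xxx" 0 0' > "$tmp/file1"
printf '%s\n' '"xyz" 0 0' '"aba" 0 0' '"xxx" 0 0' '"abc" 1 1' > "$tmp/file2"

# Print NR (running total) and FNR (per-file counter) for every line.
# The two are equal only while the first file is being read.
nr_fnr=$(cd "$tmp" && awk '{ print FILENAME, NR, FNR, (NR==FNR ? "first" : "second") }' file1 file2)
printf '%s\n' "$nr_fnr"
```

For the sample data, the fifth output line should read file2 5 1 second: FNR has restarted at 1 while NR keeps counting. One thing to keep in mind is that if the first file is empty, NR==FNR is also true for the second file, so the idiom quietly treats file2 as the lookup table.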

If you want to match whole lines, use $0:

$ awk 'NR==FNR{a[$0];next}$0 in a{print $0}' file1.txt file2.txt
"aba" 0 0
"xxx" 0 0

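Since $0 is compared verbatim, a stray trailing blank is enough to hide a match. Here is a rough sketch of one way around that, assuming a hypothetical variant of the sample data in which two lines of file1 end with a space. Assigning $1 to itself makes awk rebuild $0 from its fields, which normalizes the whitespace before the comparison.

```shell
tmp=$(mktemp -d)
# Hypothetical variant of the sample data: two lines of file1 end with a space.
printf '%s\n' '"aba" 0 0 ' '"abc" 0 1' '"abd" 1 1 ' '"xxx" 0 0' > "$tmp/file1"
printf '%s\n' '"xyz" 0 0' '"aba" 0 0' '"xxx" 0 0' '"abc" 1 1' > "$tmp/file2"

# Verbatim comparison: the trailing space keeps the "aba" line from matching.
exact=$(awk 'NR==FNR { a[$0]; next } $0 in a' "$tmp/file1" "$tmp/file2")

# $1 = $1 rebuilds $0 from the fields joined by OFS (a single space by default),
# dropping leading/trailing blanks and collapsing runs of blanks.
normalized=$(awk '{ $1 = $1 } NR==FNR { a[$0]; next } $0 in a' "$tmp/file1" "$tmp/file2")

printf 'exact:\n%s\nnormalized:\n%s\n' "$exact" "$normalized"
```

With this data the verbatim comparison should find only the "xxx" line, while the normalized one finds both the "aba" and "xxx" lines.
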
Or match on a specific set of columns:

$ awk 'NR==FNR{a[$1,$2,$3];next}($1,$2,$3) in a{print $1,$2,$3}' file1.txt file2.txt
"aba" 0 0
"xxx" 0 0

Chris Seymour answered Sep 20 '22 15:09


To print the number of matching elements, here's one way using awk (c+0 prints 0 rather than an empty line when nothing matches):

awk 'FNR==NR { a[$1]; next } $1 in a { c++ } END { print c+0 }' file1.txt file2.txt

Results using your input:

3
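
The question also asks whether both files hold the same elements, which a single count does not quite answer. Below is a rough sketch along the same lines for column one only. The counter names common, only1 and only2 are made up for this example, and only2 counts lines, so repeated keys in file2 are counted more than once.

```shell
tmp=$(mktemp -d)
printf '%s\n' '"aba" 0 0' '"abc" 0 1' '"abd" 1 1' '"xxx" 0 0' > "$tmp/file1"
printf '%s\n' '"xyz" 0 0' '"aba" 0 0' '"xxx" 0 0' '"abc" 1 1' > "$tmp/file2"

summary=$(awk '
  NR==FNR { a[$1] = 0; next }      # remember column one of file1, not yet matched
  $1 in a { a[$1] = 1; next }      # mark keys that also show up in file2
  { only2++ }                      # lines of file2 whose key is not in file1
  END {
      for (k in a) {
          if (a[k]) common++
          else only1++
      }
      # common keys, keys only in file1, lines only in file2
      print common+0, only1+0, only2+0
  }' "$tmp/file1" "$tmp/file2")
printf '%s\n' "$summary"
```

For the sample input this should print 3 1 1: three shared keys, one key only in file1 ("abd") and one line only in file2 ("xyz"). The first columns hold the same set of elements exactly when the last two numbers are both 0.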

If you'd like to add extra columns (for example, columns one, two and three), use a pseudo-multidimensional array:

awk 'FNR==NR { a[$1,$2,$3]; next } ($1,$2,$3) in a { c++ } END { print c+0 }' file1.txt file2.txt

Results using your input:

2
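
"Pseudo-multidimensional" means that a[$1,$2,$3] is really a one-dimensional array: awk joins the three subscripts into a single string key, separated by the built-in SUBSEP variable. The sketch below is only an illustration of that (the key, part and common names are made up); it rebuilds the key by hand and splits it back into its parts.

```shell
tmp=$(mktemp -d)
printf '%s\n' '"aba" 0 0' '"abc" 0 1' '"abd" 1 1' '"xxx" 0 0' > "$tmp/file1"
printf '%s\n' '"xyz" 0 0' '"aba" 0 0' '"xxx" 0 0' '"abc" 1 1' > "$tmp/file2"

common=$(awk '
  NR==FNR { a[$1,$2,$3]; next }
  ($1,$2,$3) in a {
      key = $1 SUBSEP $2 SUBSEP $3      # same key the subscript builds
      n = split(key, part, SUBSEP)      # recover the three fields from the key
      print n, part[1], part[2], part[3]
  }' "$tmp/file1" "$tmp/file2")
printf '%s\n' "$common"
```

Each output line starts with 3, the number of parts recovered from the key, followed by the matching fields. A practical consequence is that the fields themselves must not contain the SUBSEP character, otherwise two different field combinations could collapse into the same key.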

Steve answered Sep 20 '22 15:09