
Using AWK to Process Input from Multiple Files

Tags:

awk

Many people have been very helpful by posting the following solution for AWK'ing multiple input files at once:

$ awk 'FNR==NR{a[$1]=$2 FS $3;next}{ print $0, a[$1]}' file2 file1 

This works well, but I was wondering if someone could explain why. I find the AWK syntax a bit tough to get the hang of, and I was hoping someone wouldn't mind breaking the code snippet down for me.

jkovba asked Feb 20 '13 15:02



2 Answers

awk 'FNR==NR{a[$1]=$2 FS $3;next} 

Here we handle the first input (file2). Assuming FS is a space, we build up an array (a) whose index is column 1 and whose value is column 2, a space, and column 3. The FNR==NR condition together with next means that this block runs only for file2. You can check man gawk to see what NR and FNR are.

{ print $0, a[$1]}' file2 file1 

When NR != FNR, it is time to process the second input, file1. Here we print each line of file1, then use column 1 as the index to look up the stored value in the array (a) and print that as well. In other words, file1 and file2 are joined on column 1 of both files.
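The join described above can be tried on a small data set. This is only a sketch: the file contents below are hypothetical, since the question does not show its files, and mktemp -d is assumed to be available.

```shell
# Hypothetical sample data; the original question does not show its files.
tmp=$(mktemp -d)
printf '%s\n' 'id1 apple' 'id2 banana' 'id3 cherry' > "$tmp/file1"
printf '%s\n' 'id1 red 10' 'id2 yellow 20' > "$tmp/file2"

# Pass 1 (file2): store "col2 col3" under the col1 key, then skip to the next record.
# Pass 2 (file1): print each line followed by the value stored for its col1 key.
joined=$(awk 'FNR==NR{a[$1]=$2 FS $3;next}{ print $0, a[$1]}' "$tmp/file2" "$tmp/file1")
printf '%s\n' "$joined"

rm -r "$tmp"
```

With this data, id1 and id2 pick up their values from file2, while id3 has no match in file2, so its line is printed with an empty value after the separator.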

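As for NR and FNR, in short: suppose the 1st input has 5 lines and the 2nd input has 10 lines.

NR would be 1, 2, 3, ..., 15
FNR would be 1...5, then 1...10

That is the trick behind the FNR==NR check: the two counters are equal only while the first input is being read.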
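To watch the two counters side by side, a small sketch like the following prints NR and FNR for every record. The file names first and second and their contents are made up for illustration.

```shell
tmp=$(mktemp -d)
printf '%s\n' a b > "$tmp/first"      # 2 lines
printf '%s\n' x y z > "$tmp/second"   # 3 lines

# Print both counters for every record and mark whether FNR==NR holds.
counters=$(awk '{ print NR, FNR, (FNR==NR ? "same" : "differ") }' "$tmp/first" "$tmp/second")
printf '%s\n' "$counters"

rm -r "$tmp"
```

The first two records report "same" and the remaining three report "differ", because FNR restarts at 1 for the second file while NR keeps counting. One caveat: if the first file is empty, FNR==NR is also true for the records of the second file, so the idiom quietly treats the second file as the first.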

Kent answered Oct 07 '22 08:10


I found this question and answer through Google, and it appears to refer to a very specific data set from another question (How to merge two files using AWK?). What follows is the answer I was looking for, and the one I think most people would want: how to simply concatenate every line from two different files using AWK. You could probably use UNIX utilities such as join or paste for this, but AWK is more flexible when the desired output differs: you can add if statements or change the OFS (see below), which may be harder to do with those utilities. That flexibility is an important consideration for shell scripters.

For simple line-by-line concatenation:

awk 'FNR==NR { a[FNR""] = $0; next } { print a[FNR""], $0 }' file1 file2

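This emulates a numerically indexed array (AWK only has associative arrays) by relying on implicit type conversion: concatenating FNR with the empty string makes the string subscript explicit. AWK converts every subscript to a string anyway, so a[FNR] would behave the same. It is relatively expressive and easy to understand.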
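As a quick check of the subscript conversion, this sketch stores a value under the number 1 and reads it back through string subscripts. The variable name same is only illustrative.

```shell
# Store under the numeric subscript 1, then read back via "1" and via 1 "" (concatenation).
same=$(awk 'BEGIN { a[1] = "stored"; print a["1"], a[1 ""] }')
printf '%s\n' "$same"
```

Both lookups find the same element, which suggests the trailing "" in a[FNR""] is a matter of style rather than a requirement.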

Using two files called test1 and test2 with the following lines:

test1:

line one
line two
line three

test2:

line four
line five
line six

I get this result:

line one line four
line two line five
line three line six
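For comparison with the paste utility mentioned earlier, this sketch recreates test1 and test2 in a temporary directory (assuming mktemp -d is available) and runs both commands on them.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'line one' 'line two' 'line three' > "$tmp/test1"
printf '%s\n' 'line four' 'line five' 'line six' > "$tmp/test2"

# AWK version: remember every line of test1, then print it next to the matching line of test2.
awk_out=$(awk 'FNR==NR { a[FNR""] = $0; next } { print a[FNR""], $0 }' "$tmp/test1" "$tmp/test2")

# paste version: -d ' ' joins corresponding lines with a single space.
paste_out=$(paste -d ' ' "$tmp/test1" "$tmp/test2")

printf '%s\n' "$awk_out"
rm -r "$tmp"
```

For files of equal length, the two outputs match. The AWK version earns its keep once the output needs conditions or extra formatting that paste cannot express.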

Depending on how you want to join the values between the columns in the output, you can pick the appropriate output field separator. Here's an example with ellipses (...) separating the columns:

awk 'BEGIN { OFS="..."} FNR==NR { a[(FNR"")] = $0; next } { print a[(FNR"")], $0 }' test1 test2

Yielding this result:

line one...line four
line two...line five
line three...line six

I hope at least that this inspires you all to take advantage of the power of AWK!

Amiga500Kid answered Oct 07 '22 08:10