
Is Awk and multiple file processing possible?

I need to process the contents of two files, and I was wondering whether this can be done with a single nawk statement.

File A contents:

AAAAAAAAAAAA  1
BBBBBBBBBBBB  2
CCCCCCCCCCCC  3

File B contents:

XXXXXXXXXXX  3
YYYYYYYYYYY  2
ZZZZZZZZZZZ  1

I would like to check whether the $2 values (the 2nd field) in File A are the reverse of the $2 values in File B. How do I write rules in nawk for multi-file processing? How would we distinguish A's $2 from B's $2?

EDIT: I need to compare $2 of A's first line (which is 1) with $2 of B's last line (which is 1 again), then $2 of line 2 in A with $2 of the second-to-last line of B, and so on.

tomkaith13 asked Dec 14 '11 07:12




2 Answers

You can do something like this:

[jaypal:~/Temp] cat f1
AAAAAAAAAAAA  1
BBBBBBBBBBBB  2
CCCCCCCCCCCC  3
DDDDDDDDDDDD  4

[jaypal:~/Temp] cat f2
AAAAAAAAAAA  5
XXXXXXXXXXX  3
YYYYYYYYYYY  2
ZZZZZZZZZZZ  1

Solution:

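awk '
# First file (f2): store its second field, line by line
NR==FNR {a[i++]=$2; next}
# Second file (f1): compare against the stored values in reverse order
{print (a[--i] == $2 ? "Match " $2 FS a[i] : "Do not match " $2 FS a[i])}' f2 f1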
Match 1 1
Match 2 2
Match 3 3
Do not match 4 5
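
On the question of telling A's $2 from B's $2: NR counts records across all input files, while FNR restarts at 1 for each file, so NR == FNR is true only while the first file is being read. Here is a minimal sketch of the same idea that indexes the stored values by line number. The file names a.txt and b.txt and the temporary directory are hypothetical; the data is taken from the question.

```shell
#!/bin/sh
# Hypothetical sample files holding the data from the question.
dir=$(mktemp -d)
printf 'AAAAAAAAAAAA  1\nBBBBBBBBBBBB  2\nCCCCCCCCCCCC  3\n' > "$dir/a.txt"
printf 'XXXXXXXXXXX  3\nYYYYYYYYYYY  2\nZZZZZZZZZZZ  1\n' > "$dir/b.txt"

result=$(awk '
    # NR == FNR holds only while the first file (b.txt) is being read.
    NR == FNR { b[FNR] = $2; n = FNR; next }
    # Second file (a.txt): line FNR of A pairs with line n - FNR + 1 of B.
    {
        other = b[n - FNR + 1]
        print FNR, $2, other, ($2 == other ? "match" : "differ")
    }
' "$dir/b.txt" "$dir/a.txt")

echo "$result"
rm -rf "$dir"
```

Each output line shows the line number in A, A's $2, the paired $2 from B, and the verdict. Note that the NR == FNR test assumes the first file is not empty.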
jaypal singh answered Nov 27 '22 10:11


You can make awk process files serially, but you can't easily make it process two files in parallel. You can probably achieve the effect with careful use of getline, but 'careful' is the operative word.
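
For what it's worth, a lockstep read with getline might look roughly like the sketch below. It assumes File B has already been put in the order you want to compare against (here a hypothetical b_rev.txt containing the question's File B reversed). The variable name other and the file names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample files; b_rev.txt is File B from the question, reversed.
dir=$(mktemp -d)
printf 'AAAAAAAAAAAA  1\nBBBBBBBBBBBB  2\nCCCCCCCCCCCC  3\n' > "$dir/a.txt"
printf 'ZZZZZZZZZZZ  1\nYYYYYYYYYYY  2\nXXXXXXXXXXX  3\n' > "$dir/b_rev.txt"

result=$(awk -v other="$dir/b_rev.txt" '
{
    # For every record of a.txt, pull one line from the other file.
    # getline returns 1 on success, 0 at end of file and -1 on error,
    # so test explicitly for > 0 rather than for "true".
    if ((getline line < other) > 0) {
        split(line, f)              # split on the default field separator
        print $2, f[2], ($2 == f[2] ? "match" : "differ")
    } else {
        print $2, "-", "missing"    # the other file ran out of lines
    }
}' "$dir/a.txt")

echo "$result"
rm -rf "$dir"
```

The 'careful' part is mostly the return-value check: a bare while (getline line < file) loops forever if the file cannot be opened, because -1 is treated as true.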

I think in this case, with simple two-column files, I'd be inclined to use:

paste "File A" "File B" |
awk '{ process fields $1, $2 from File A and fields $3, $4 from file B }'

You would need to make sure the two files are in the appropriate order, etc.
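
As a rough sketch of the paste approach applied to the question's data: File B is reversed first (here with a small awk program, so no non-standard tools are needed) and then pasted next to File A. The file names a.txt and b.txt are hypothetical stand-ins for File A and File B.

```shell
#!/bin/sh
# Hypothetical sample files holding the data from the question.
dir=$(mktemp -d)
printf 'AAAAAAAAAAAA  1\nBBBBBBBBBBBB  2\nCCCCCCCCCCCC  3\n' > "$dir/a.txt"
printf 'XXXXXXXXXXX  3\nYYYYYYYYYYY  2\nZZZZZZZZZZZ  1\n' > "$dir/b.txt"

# 1. Reverse b.txt by buffering its lines and printing them last to first.
# 2. paste joins line N of a.txt with line N of the reversed stream ("-").
# 3. In the final awk, $1 and $2 come from File A, $3 and $4 from File B.
result=$(awk '{ line[NR] = $0 } END { for (i = NR; i >= 1; i--) print line[i] }' "$dir/b.txt" |
    paste "$dir/a.txt" - |
    awk '{ print $2, $4, ($2 == $4 ? "match" : "differ") }')

echo "$result"
rm -rf "$dir"
```

This relies on every line having exactly two whitespace-separated fields. If a line in either file has more or fewer, the $3/$4 positions shift.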

If your input is more complex, this may not work so well. You can, however, choose the character that separates the data from the two files: paste -d'|' ... puts a pipe between the two records, and awk -F'|' '{ ... }' then reads $1 as the whole record from File A and $2 as the whole record from File B.
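
A sketch of the delimiter variant, assuming the pipe character never appears in the data and that File B is already in the order to compare against (the hypothetical b_rev.txt below). Each side is split into its own array, so the two files may have different numbers of fields.

```shell
#!/bin/sh
# Hypothetical sample files; b_rev.txt is File B from the question, reversed.
dir=$(mktemp -d)
printf 'AAAAAAAAAAAA  1\nBBBBBBBBBBBB  2\nCCCCCCCCCCCC  3\n' > "$dir/a.txt"
printf 'ZZZZZZZZZZZ  1\nYYYYYYYYYYY  2\nXXXXXXXXXXX  3\n' > "$dir/b_rev.txt"

result=$(paste -d'|' "$dir/a.txt" "$dir/b_rev.txt" |
    awk -F'|' '{
        # $1 is the whole File A record, $2 the whole File B record.
        # A separator of " " makes split() break on runs of whitespace.
        split($1, a, " ")
        split($2, b, " ")
        print a[2], b[2], (a[2] == b[2] ? "match" : "differ")
    }')

echo "$result"
rm -rf "$dir"
```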

Jonathan Leffler answered Nov 27 '22 09:11