Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

awk statements - if not found (grep'ed) do

Tags:

awk

My table file example looks like that

Name1   xxxxx  34
Name1   xxxxx  37
Name2   aaaaa  59
Name2   xxxxx  90
Name4   Name3  12

Name file looks like that

Name1 
Name2
Name3
Name4 

I want awk to match Name1/2/3/4 from Name file to table file $1 and print sum of $3. If Name is not found print 0 - how can I do such if statement in awk?

What I already done:

for i in $(cat Name_file)
do 
cat table | awk -v NAME="$i" '($1==NAME) {SUM+=$3} END {print NAME"\t"SUM}'
done

Gives output

Name1   71
Name2   149
Name3   
Name4   12

It's almost perfect - I want to add 0 to Name3 to get such output

Name1   71
Name2   149
Name3   0
Name4   12

So much question is: How to add if not found do function in awk?

like image 671
Paul Gardir Avatar asked Jun 13 '13 09:06

Paul Gardir


2 Answers

Y do not need any 'not found' behaviour. You're just not properly initialised SUM variable before counting. Use BEGIN {SUM = 0} for that.

If you need found/not found behaviour explicitly, do it similarly. First, initialize some variable BEGIN {FOUND = 0} then change it some way on pattern match: (...) {FOUND = FOUND+1} and finally test it with if(FOUND!=0).

like image 140
Vovanium Avatar answered Oct 15 '22 02:10

Vovanium


Try sg like this:

awk 'NR==FNR{a[$1]=0;next}$1 in a{a[$1]+=$3}END{for(i in a) print i,a[i]}' Name_file table

Output:

Name1 71
Name2 149
Name3 0
Name4 12

In this case you do not need the bash loop around awk. It reads the Names_table first then process all lines of table in one step. So it is much more effective.

ADDED

Or a pure bash (>= 4.0) solution:

printf -v tmp "[%s]=0 " $(<Name_file)
declare -A htmp
eval htmp=($tmp)
while read a b c; do [ -n "${htmp[$a]}" ] && ((htmp[$a] += $c)); done <table
for i in ${!htmp[*]}; do echo $i ${htmp[$i]}; done

EXTENDED

The extended question was to group by $1 and $2 (and Name_file contains all first keys from table, so it is not really needed to process).

cat >table <<XXX
Name1   xxxxx  34
Name1   xxxxx  37
Name2   aaaaa  59
Name2   xxxxx  90
Name4   Name3  12
XXX

awk -v SUBSEP=, '{a[$1,$2]+=$3;++n[$1,$2]}END{for(i in a) print i,a[i],n[i]}' table

Output:

Name2,xxxxx 90 1
Name2,aaaaa 59 1
Name4,Name3 12 1
Name1,xxxxx 71 2
like image 41
TrueY Avatar answered Oct 15 '22 02:10

TrueY