
Variable evaluation before assignment in awk

Tags: awk

In the following awk statement:

awk '$2 > maxrate {maxrate = $2; maxemp = $1} 
     END {print "highest hourly rate:", maxrate, "for", maxemp}' pay.data

run on the following data:

Beth 4.00 0
Dan 3.75 0
Kathy 4.00 10
Mark 5.00 20
Mary 5.50 22
Susie 4.25 18

How does $2 > maxrate work, given that maxrate is evaluated before anything has been assigned to it?

Asked by Thierry Huysman, Dec 31 '22 20:12


1 Answer

From the GNU awk manual:

By default, variables are initialized to the empty string, which is zero if converted to a number. There is no need to explicitly initialize a variable in awk, which is what you would do in C and in most other traditional languages.

This implicit initialisation, which is common in scripting languages, is convenient but also leaves room for mistakes and confusion.
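A quick way to see the rule from the manual in action is to print a variable that is never assigned. This is only a minimal sketch; the name x is an arbitrary placeholder, not something from the question.

```shell
# x is never assigned anywhere in this awk program.
# "[" x "]"  shows its string value (empty),
# x + 0      shows its numeric value (zero),
# the two comparisons show that it equals both 0 and "".
out=$(awk 'BEGIN { print "[" x "]", x + 0, (x == 0), (x == "") }')
echo "$out"
```

This should print [] 0 1 1: the same variable behaves as the empty string in a string context and as zero in a numeric one, which is why $2 > maxrate can be evaluated on the first record.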


For example, in this case, you can calculate the maximum, with no need to initialise max:

awk '$2 > max{max = $2} END{print "max:", max}' file
max: 5.50

But if you do the same for the minimum, you get the empty string as the result. min is uninitialised, so it compares as zero; $2 < min is never true for these positive values, and min is still empty when END prints it.

awk '$2 < min{min = $2} END{print "min:", min}' file
min: 
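To check that the condition really never fires, one can print the result of the comparison for each record. The sketch below inlines three rows from the question's data so that it runs without a file; min is deliberately left unassigned.

```shell
# Three rows copied from the question's pay.data, inlined for a self-contained run.
# min is never assigned, so it compares as 0 against the numeric field $2.
trace=$(printf '%s\n' 'Beth 4.00 0' 'Dan 3.75 0' 'Mary 5.50 22' |
    awk '{ print $2, ($2 < min) }')
echo "$trace"
```

Every line should end in 0 (false), so the assignment min = $2 is never reached.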

The max calculation would fail in the same way if all values were negative, so it is safer to seed the variable from the first record:

awk 'NR==1{min=$2; next} $2<min{min = $2} END{print "min:", min}' file
min: 3.75

This approach works for both min and max, for numbers in any range. In general, when scripting, we have to consider every path on which an undefined or uninitialised variable might be tested before it has been given a value.
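As a rough illustration of that last point, the sketch below runs the unseeded max and a seeded min/max over data whose values are all negative. The three-row data set is made up for the example and is not from the question.

```shell
# Hypothetical sample data: every value in column 2 is negative.
data='a -3
b -1.5
c -2'

# Unseeded: max starts out as 0 numerically, so no negative value exceeds it
# and max is still empty in END.
naive=$(printf '%s\n' "$data" |
    awk '$2 > max {max = $2} END {print "max:", max}')

# Seeded: take both extremes from the first record, then compare the rest.
seeded=$(printf '%s\n' "$data" | awk '
    NR == 1  { min = max = $2; next }
    $2 < min { min = $2 }
    $2 > max { max = $2 }
    END      { print "min:", min, "max:", max }')

printf '%s\n' "$naive" "$seeded"
```

The first line of output should be max: followed by nothing, and the second min: -3 max: -1.5. Seeding from NR == 1 avoids having to guess a sentinel value that is guaranteed to lie outside the data.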

Answered by thanasisp, Jan 14 '23 12:01