
Awk summation is carried out without floating point precision

Tags: bash, awk, sum

I have a file.txt and I'm trying to sum the values of the fourth and fifth columns (fields $5 and $6 in awk, because every line starts with the ^ delimiter, which makes $1 empty):

^20170821^3007030^153^863.53^0.42^
^20170821^1402675^110^581.36^0.37^
^20170821^1404785^24^155.29^0.29^
^20170821^1406505^40^210.51^0.00^
^20170821^1005^1^18.00^0.00^
^20170821^9657^7^7.28^0.00^
^20170821^143646^86^486.59^0.08^
^20170821^342657^3^12.60^0.00^
^20170821^1006^4^7.04^0.04^
^20170821^1004^1215^3502.44^12.09^
^20170821^1007^932^6689.64^15.07^
^20170821^378228^1^2.80^0.00^
^20170821^704797^4^23.80^0.00^
^20170821^705642^2^9.80^0.00^
^20170821^703689^7^40.60^0.00^
^20170821^148340^75^382.81^0.20^
^20170821^257^2^5.60^0.00^
^20170821^3702^1^2.80^0.00^
^20170821^3703^1^7.00^0.00^
^20170821^258^1^7.00^0.00^
^20170821^920299^11^60.20^0.00^
^20170821^210705^2^14.00^0.00^
^20170821^867693^12^65.88^0.08^
^20170821^2635085^6^33.60^0.00^
^20170821^13211^140^409.18^0.58^
^20170821^64^2^14.00^0.00^
^20170821^13214^234^1685.91^1.26^
^20170821^13212^2^34.90^0.00^
^20170821^13213^2^2.80^0.00^
^20170821^18385^8^7.28^0.00^


 $ awk -F '^' '{sum += $5} END {print sum}' file.txt

I get the following result: 15344.2

 $ awk -F '^' '{sum += $6} END {print sum}' file.txt

I get the following result: 30.48

I then checked the results in Excel. The first awk sum turned out to be wrong: it is 0.04 short of the Excel total.


How can I sum the column correctly?

Andrey asked Dec 19 '22 04:12


2 Answers

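The sum itself is correct; what you are seeing is how print formats it. print converts non-integer numbers using the format stored in OFMT, which defaults to "%.6g" (six significant digits), so 15344.24 is displayed as 15344.2. Instead of relying on print, use printf with an explicit format modifier, e.g. with two digits of precision: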

awk -F '^' '{ sum += $5 } END { printf "%.2f\n", sum }' file
15344.24
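
To see the formatting difference in isolation, here is a minimal sketch that prints the expected total from the question both ways. The shell variable names default_out and fixed_out are only illustrative and are not part of the original commands; the output comments assume OFMT has its default value.

```shell
# Print the same number with awk's default print formatting and with printf.
# 15344.24 is the column total the question expects.
default_out=$(awk 'BEGIN { x = 15344.24; print x }')
fixed_out=$(awk 'BEGIN { x = 15344.24; printf "%.2f\n", x }')

echo "print:  $default_out"   # print:  15344.2
echo "printf: $fixed_out"     # printf: 15344.24
```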

The GNU Awk manual, in the section "Modifiers for printf Formats", describes the precision modifier as follows:

.prec

A period followed by an integer constant specifies the precision to use when printing. The meaning of the precision varies by control letter:


As a side note, you can keep using the print statement and set the special awk variable OFMT instead. From the GNU Awk manual, section "Controlling Numeric Output with print":

The predefined variable OFMT contains the format specification that print uses with sprintf() when it wants to convert a number to a string for printing. The default value of OFMT is "%.6g". The way print prints numbers can be changed by supplying a different format specification for the value of OFMT

Your example can therefore be modified to set OFMT in the BEGIN block, so that print uses two decimal places:

awk -F '^' 'BEGIN { OFMT="%.2f" }{ sum += $5 } END { print sum }' file
15344.24

OFMT is specified by POSIX, so this works in any POSIX-compliant awk.
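
As a rough illustration of the OFMT approach, the sketch below sums three rows copied from the question, once with the default OFMT and once with OFMT set to "%.2f". The shell variables rows, default_sum and ofmt_sum are hypothetical names introduced only for this example.

```shell
# Three rows taken from the question's file.txt.
rows='^20170821^1004^1215^3502.44^12.09^
^20170821^1007^932^6689.64^15.07^
^20170821^3007030^153^863.53^0.42^'

# Default OFMT ("%.6g"): print keeps six significant digits.
default_sum=$(printf '%s\n' "$rows" | awk -F '^' '{ sum += $5 } END { print sum }')

# OFMT set in BEGIN: print now uses two decimal places.
ofmt_sum=$(printf '%s\n' "$rows" | awk -F '^' 'BEGIN { OFMT = "%.2f" } { sum += $5 } END { print sum }')

echo "$default_sum"   # 11055.6
echo "$ofmt_sum"      # 11055.61
```

Note that OFMT only applies to non-integer numbers printed with print; integer values are still printed as integers, and printf ignores OFMT entirely.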

Inian answered Dec 22 '22 00:12


Try this:

awk -F '^' '{sum += $5} END {printf "%.2f\n", sum}' file.txt 

The "%.2f" format prints the sum rounded to two decimal places.
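
Since the question asks about two columns, the same format can cover both totals in one pass over the file. This is only a sketch on three rows copied from the question; rows, totals, amount and fee are made-up names for illustration, not anything defined in the original commands.

```shell
# Three rows taken from the question's file.txt.
rows='^20170821^1004^1215^3502.44^12.09^
^20170821^1007^932^6689.64^15.07^
^20170821^3007030^153^863.53^0.42^'

# Accumulate field 5 and field 6 separately, then format both with %.2f.
totals=$(printf '%s\n' "$rows" |
  awk -F '^' '{ amount += $5; fee += $6 } END { printf "%.2f %.2f\n", amount, fee }')

echo "$totals"   # 11055.61 27.58
```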

Dan Kreiger answered Dec 21 '22 22:12