
AWK sum of large integers

Tags: awk

I am trying to sum a list of integers from a log file using awk:

{sum+=$1} END {print sum}

The problem is that the result is larger than INT_MAX as defined in my limits.h file, so print outputs 3.68147e+09.

Is there an elegant way of printing the entire value of the sum?

Thank you!

Asked by Alex, Jan 22 '14 08:01



2 Answers

GNU awk (gawk) has the -M option; try that. It should keep the precision for you.

gawk has to be compiled with the MPFR and GMP libraries for -M to work; this support cannot be added at run time.

Here is an example with and without -M, tested with gawk 4.1.0 on 64-bit Linux (Arch Linux):

kent$  awk 'BEGIN{printf "%d\n","368147000099999999999999999999999999"}'  
368147000099999983291776543710248960

kent$  awk -M 'BEGIN{printf "%d\n","368147000099999999999999999999999999"}'
368147000099999999999999999999999999
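
Applied to the summation from the question, the same option might be used as in the sketch below. It assumes a gawk built with MPFR/GMP support; `sum_exact` is a hypothetical helper name and the two sample values are made up for illustration.

```shell
# Sum the first field of each input line with arbitrary-precision arithmetic.
# -M (--bignum) only takes effect if gawk was built with MPFR/GMP support.
# sum_exact is a hypothetical helper name used for this sketch.
sum_exact() {
    gawk -M '{ sum += $1 } END { print sum }' "$@"
}

# Sample values whose total is far beyond what a double holds exactly.
# The demo only runs if gawk is installed.
if command -v gawk >/dev/null 2>&1; then
    printf '%s\n' 99999999999999999999 1 | sum_exact
fi
```

With a bignum-capable gawk this should print 100000000000000000000. The helper also accepts file names, e.g. `sum_exact logfile`.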

Answered by Kent, Nov 09 '22 10:11


awk has no integer type large enough for arbitrary data: it keeps numbers as double-precision floating point, so a large enough sum loses its low-order digits. As far as I know, standard awk has no data type with enough precision for that. In other words, once precision is lost the problem is not in printing; awk literally does not have the information you want.
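
Whether information is actually lost depends on the size of the sum: a double represents every integer exactly up to 2^53 (9007199254740992). As a hedged sketch, assuming an awk that uses IEEE double precision (typical, but not guaranteed everywhere), an explicit printf format shows the full value in that range. `sum_first_field` is a hypothetical helper name and the sample values are made up for illustration.

```shell
# Sum the first field of each input line and print the total as a plain integer.
# "%.0f" formats the double with no exponent and no fractional digits.
# sum_first_field is a hypothetical helper name used for this sketch.
sum_first_field() {
    awk '{ sum += $1 } END { printf "%.0f\n", sum }' "$@"
}

# Sample values chosen so the total exceeds a 32-bit INT_MAX (2147483647).
printf '%s\n' 2147483647 1534000000 | sum_first_field
# prints 3681483647
```

Above 2^53 this format still prints digits, but the low-order ones are no longer reliable.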

You can try Ruby instead, which promotes integers to big integers rather than to floats. For example:

ruby -nae 'BEGIN{sum=0}; END{puts sum}; sum+=$F[0].to_i'

Answered by Amadan, Nov 09 '22 11:11