
Is there an awk equivalent of INT_MIN and INT_MAX?

Tags: awk, gawk

In C and Java, there are defined constants representing the maximum and minimum values an integer can hold.

Are there such constants in awk? If so, what are their names?

The awk manual indicates that awk can support arbitrary precision integer arithmetic with -M, but I'd like to know about the bounds on integers when we do not specify -M.

merlin2011 asked Apr 27 '18 21:04



2 Answers

This isn't something I've considered before, so I may be barking up the wrong tree completely. But since awk uses double-precision floating-point numbers by default, maybe what you're looking for is based on the value of PREC in gawk (see https://www.gnu.org/software/gawk/manual/gawk.html#Setting-precision). Look:

$ awk 'BEGIN{print PREC}'
53

$ awk 'BEGIN{print (2^52)}'
4503599627370496
$ awk 'BEGIN{print (2^52)+1}'
4503599627370497

$ awk 'BEGIN{print (2^PREC)}'
9007199254740992
$ awk 'BEGIN{print (2^PREC)+1}'
9007199254740992

Notice how integer arithmetic fails when you try to go beyond 2^PREC? Adding 1 to 2^PREC prints the same value again. So maybe 2^PREC is a reasonable value to use as an INT_MAX equivalent, and you could derive an INT_MIN similarly (for example, -(2^PREC)). Think about it, try it, and see if it makes sense for your needs.
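The same limit can be checked outside awk. The following is a minimal sketch in Python, assuming that Python floats are IEEE 754 doubles like awk's numbers without -M (true on typical platforms). The names MAX_EXACT_INT, MIN_EXACT_INT and is_exact are made up for illustration; neither awk nor Python defines them.

```python
import sys

# Number of significand bits in a Python float; on typical platforms this
# matches the default value of gawk's PREC shown above.
MANT_DIG = sys.float_info.mant_dig

# Hypothetical constants: the range in which every integer is exactly
# representable as a double.
MAX_EXACT_INT = 2 ** MANT_DIG
MIN_EXACT_INT = -MAX_EXACT_INT


def is_exact(n: int) -> bool:
    """Return True if the integer n survives a round trip through a float."""
    return int(float(n)) == n


print(MANT_DIG)                     # 53
print(MAX_EXACT_INT)                # 9007199254740992
print(is_exact(MAX_EXACT_INT))      # True
print(is_exact(MAX_EXACT_INT + 1))  # False: rounds back to 2^53
print(is_exact(MIN_EXACT_INT - 1))  # False: rounds back to -(2^53)
```

Larger integers can still be stored, but only some of them: beyond 2^53 the representable values are spaced 2 or more apart, so it is a bound on contiguous exact integers rather than a hard maximum.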

Ed Morton answered Sep 24 '22 08:09


High integers in current (g)awk are oddly broken without -M. It is easy to spot that BEGIN {print 2^1024} yields inf, whereas BEGIN {print 2^1023} works. One would therefore assume that the maximum integer in this particular implementation is 2^1024 − 1. Yet this is not the case.

A simple experiment, based on the fact that 2^1024 − 1 = 2^1023 + 2^1022 + ⋯ + 2^1 + 2^0:

BEGIN {for (i = 1023; i >= 0; --i) sum += 2^i; print sum}

This^^^ yields infinity, surprisingly enough. So, at which point do we need to stop adding the powers of 2 in order to obtain a valid result? On my systems the limit appears to be 971: let the loop run down to 970 instead and the sum becomes infinity.

BEGIN {for (i = 1023; i >= 971; --i) sum += 2^i; print sum}

This^^^ prints 179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.

The value has a surprising property in awk: adding anything to it, up to a certain size, does not change it any more. (Try to print, e.g., sum + 3.) Adding a value beyond a certain threshold yields infinity. This looks like a bug, but it matches how IEEE 754 double-precision numbers behave: the value is the largest finite double, neighboring doubles at this magnitude are 2^971 apart, and anything smaller than half that gap is rounded away.
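One way to look at this behavior is that the sum is the largest finite double. The sketch below checks that in Python, assuming Python floats are the same IEEE 754 doubles awk uses without -M. The variable names are illustrative only, and math.ulp requires Python 3.9 or later.

```python
import math
import sys

# Exact value of 2^1023 + ... + 2^971, computed with Python's
# arbitrary-precision integers (no rounding involved).
exact_sum = sum(2 ** i for i in range(971, 1024))

dbl_max = sys.float_info.max  # largest finite double

same_as_dbl_max = exact_sum == int(dbl_max)
gap = math.ulp(dbl_max)       # distance from dbl_max to the next smaller double

absorbed = dbl_max + 3.0 == dbl_max   # far below half the gap: rounded away
still_finite = dbl_max + 2.0 ** 969   # a quarter of the gap: rounded away
overflowed = dbl_max + 2.0 ** 970     # half the gap: rounds up past the maximum

print(same_as_dbl_max)          # True
print(gap == 2.0 ** 971)        # True
print(absorbed)                 # True
print(still_finite == dbl_max)  # True
print(math.isinf(overflowed))   # True
```

Under these assumptions the threshold is 2^970: smaller addends leave the value unchanged, while 2^970 or more pushes the result to infinity, which matches the experiments with the loop bounds 971 and 970.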

As for the original sum above (2^1023 + ⋯ + 2^971), it is still correct in awk. Things start to fall apart once you try to increase that sum further. For example (and surprisingly), this still yields the same result as above:

BEGIN {for (i = 1023; i >= 971; --i) sum += 2^i
       for (i = 969; i >= 0; --i) sum += 2^i
       print sum}

Checking both sums with Python is easy:

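total = 0  # avoid shadowing Python's built-in sum()

# 2^1023 + ... + 2^971
for i in range(971, 1024):
    total += 2**i
print(total)  # awk gets this right

# Add 2^969 + ... + 2^0 on top (2^970 is skipped, as in the awk loop)
for i in range(0, 970):
    total += 2**i
print(total)  # awk without -M gets this wrong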
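The Python check above uses exact integers. To mimic what awk does without -M, the same loops can be run with floats. This is a minimal sketch, assuming Python floats are IEEE 754 doubles like awk's numbers; float_sum and exact_sum are hypothetical helper names.

```python
def float_sum(exponents):
    """Add 2^i for each i in order, rounding to a double after every step."""
    total = 0.0
    for i in exponents:
        total += 2.0 ** i
    return total


def exact_sum(exponents):
    """Add 2^i for each i using arbitrary-precision integers."""
    return sum(2 ** i for i in exponents)


high = list(range(1023, 970, -1))  # 1023 down to 971
low = list(range(969, -1, -1))     # 969 down to 0 (970 is skipped)

float_high = float_sum(high)
float_both = float_sum(high + low)

print(int(float_high) == exact_sum(high))       # True: still exact
print(float_both == float_high)                 # True: the low terms are rounded away
print(exact_sum(high + low) == exact_sum(high)) # False: the exact sums differ
print(float_sum(range(1023, -1, -1)))           # inf: adding 2^970 overflows
```

The float version reproduces the three awk results described above, while the integer version shows what the sums would be with arbitrary precision, as with gawk -M.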

All in all, I think I will be setting -M in awk all the time from now on!

Andrej Podzimek answered Sep 23 '22 08:09