
Maximum integer in Perl

Set $i = 0 and keep doing ++$i for as long as it increases. Which number would we reach?

Note that this may not be the same as the maximum integer in Perl (as asked in the title), because there may be gaps greater than 1 between adjacent integers.

porton asked Dec 01 '22 01:12


1 Answer

"Integer" can refer to a family of data types (int16_t, uint32_t, etc). There's no gap in the numbers these can represent.

"Integer" can also refer to numbers without a fractional component, regardless of the type of the variable used to store it. ++ will seamlessly transition between data types, so this is what's relevant to this question.

Floating point numbers can store integers in this sense, and a float can hold a number so large that adding one to it has no effect. The reason is that floating point numbers are stored in the following form:

[+/-]1._____..._____ * 2**____

For example, let's say the mantissa of your floats can store 52 bits after the binary point, and you want to add 1 to 2**53.

     __52 bits__
    /           \
  1.00000...00000  * 2**53    Large power of two
+ 1.00000...00000  * 2**0     1
--------------------------
  1.00000...00000  * 2**53
+ 0.00000...000001 * 2**53    Normalized exponents
--------------------------
  1.00000...00000  * 2**53
+ 0.00000...00000  * 2**53    What we really get due to limited number of bits
--------------------------
  1.00000...00000  * 2**53    Original large power of two
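
This rounding can be observed from Perl itself. The following is a minimal sketch, assuming IEEE 754 doubles and a Perl recent enough (5.10 or later) to accept the > byte-order modifier in pack templates; double_hex is a hypothetical helper name, not something taken from the answer. It packs a number as a C double and prints the bytes in hex, so that 2**53 and 2**53 + 1 can be compared bit for bit.

```perl
use strict;
use warnings;

# Hypothetical helper: the 8 bytes of $n as a big-endian C double, in hex.
sub double_hex {
    my ($n) = @_;
    return unpack("H*", pack("d>", $n));
}

# Decimal literals are used so the inputs do not depend on how ** is evaluated.
print double_hex(9007199254740991), "\n";   # 2**53 - 1
print double_hex(9007199254740992), "\n";   # 2**53
print double_hex(9007199254740993), "\n";   # 2**53 + 1
```

If the assumptions hold, the last two lines come out identical: the double has no spare bit in which to record the extra 1, so 2**53 + 1 is stored as 2**53.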

So it is possible to hit a gap when using floating point numbers. However, you started with a number stored as a signed integer.

$ perl -MB=svref_2object,SVf_IVisUV,SVf_NOK -e'
   $i = 0;
   $sv = svref_2object(\$i);
   print $sv->FLAGS & SVf_NOK    ? "NV\n"   # Float
      :  $sv->FLAGS & SVf_IVisUV ? "UV\n"   # Unsigned int
      :                            "IV\n";  # Signed int
'
IV
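
The flag test in this one-liner recurs throughout the rest of the answer. As a sketch only, it could be wrapped in a small helper for use in a script; num_type is a hypothetical name, and the check order (NV first, then UV, then IV) simply mirrors the one-liner above.

```perl
use strict;
use warnings;
use B qw(svref_2object SVf_IVisUV SVf_NOK);

# Hypothetical helper: takes a reference to a scalar holding a number and
# reports which numeric representation its flags indicate.
sub num_type {
    my ($ref) = @_;
    my $flags = svref_2object($ref)->FLAGS;
    return $flags & SVf_NOK    ? "NV"    # Float
         : $flags & SVf_IVisUV ? "UV"    # Unsigned int
         :                       "IV";   # Signed int
}

my $int   = 0;
my $uint  = ~0;     # all bits set, too large for a signed integer
my $float = 1.5;

print num_type(\$int),   "\n";
print num_type(\$uint),  "\n";
print num_type(\$float), "\n";
```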

++$i will leave the number as a signed integer value ("IV") until it no longer can. At that point, it will start using unsigned integer values ("UV").

$ perl -MConfig -MB=svref_2object,SVf_IVisUV,SVf_NOK -e'
   $i = hex("7F".("FF"x($Config{ivsize}-2))."FD");
   $sv = svref_2object(\$i);
   for (1..4) {
      ++$i;
      printf $sv->FLAGS & SVf_NOK    ? "NV %.0f\n"
         :   $sv->FLAGS & SVf_IVisUV ? "UV %u\n"
         :                             "IV %d\n", $i;
   }
'
IV 2147483646
IV 2147483647            <-- 2**31 - 1  Largest IV
UV 2147483648
UV 2147483649

or

IV 9223372036854775806
IV 9223372036854775807   <-- 2**63 - 1  Largest IV
UV 9223372036854775808
UV 9223372036854775809
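
The two boundaries can also be computed without building hex strings. This is a minimal sketch, assuming "use integer" is not in effect (under it, ~0 would be treated as -1); the variable names are illustrative only.

```perl
use strict;
use warnings;

my $uv_max = ~0;        # every bit set: the largest unsigned value
my $iv_max = ~0 >> 1;   # top bit cleared: the largest signed value

printf "Largest IV: %d\n", $iv_max;
printf "Largest UV: %u\n", $uv_max;

# Stepping past the largest IV stays exact because the result fits in a UV.
my $next = $iv_max + 1;
printf "Largest IV + 1: %u (difference %d)\n", $next, $next - $iv_max;
```

The printed values depend on ivsize, so they should match one of the two outputs above for your build.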

Still no gap because no floating point numbers have been used yet. But Perl will eventually use floating point numbers ("NV") because they have a far larger range than integers. ++$i will switch to using a floating point number when it runs out of unsigned integers.

When that happens depends on your build of Perl. Not all builds of Perl have the same integer and floating point number sizes.

On one machine:

$ perl -V:[in]vsize
ivsize='4';   # 32-bit integers
nvsize='8';   # 64-bit floats

On another:

$ perl -V:[in]vsize
ivsize='8';   # 64-bit integers
nvsize='8';   # 64-bit floats

On a system where nvsize is larger than ivsize

On these systems, the first gap will happen above the largest unsigned integer. If your system uses IEEE double-precision floats, your floats have 53 bits of precision. They can represent without loss all integers from -2**53 to 2**53 (inclusive). ++ will fail to increment beyond that.

$ perl -MConfig -MB=svref_2object,SVf_IVisUV,SVf_NOK -e'
   $i = eval($Config{nv_overflows_integers_at}) - 3;
   $sv = svref_2object(\$i);
   for (1..4) {
      ++$i;
      printf $sv->FLAGS & SVf_NOK    ? "NV %.0f\n"
         :   $sv->FLAGS & SVf_IVisUV ? "UV %u\n"
         :                             "IV %d\n", $i;
   }
'
NV 9007199254740990
NV 9007199254740991
NV 9007199254740992   <-- 2**53      Requires 1 bit of precision as a float
NV 9007199254740992   <-- 2**53 + 1  Requires 54 bits of precision as a float
                                        but only 53 are available.

On a system where nvsize is no larger than ivsize

On these systems, the first gap will happen just above the largest unsigned integer. Switching to floating point numbers will allow you to go one further (to a large power of two), but that's it. ++ will fail to increment beyond the largest unsigned integer + 1.

$ perl -MConfig -MB=svref_2object,SVf_IVisUV,SVf_NOK -e'
   $i = hex(("FF"x($Config{ivsize}-1))."FD");
   $sv = svref_2object(\$i);
   for (1..4) {
      ++$i;
      printf $sv->FLAGS & SVf_NOK    ? "NV %.0f\n"
         :   $sv->FLAGS & SVf_IVisUV ? "UV %u\n"
         :                             "IV %d\n", $i;
   }
'
UV 18446744073709551614
UV 18446744073709551615   <-- 2**64 - 1  Largest UV
NV 18446744073709551616   <-- 2**64      Requires 1 bit of precision as a float
NV 18446744073709551616   <-- 2**64 + 1  Requires 65 bits of precision as a float
                                            but only 53 are available.
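
To find the stopping point on a given build without counting up from zero, one can probe powers of two instead. This is a rough sketch under two assumptions that are not from the answer: that the first value where adding 1 has no effect is a power of two (true for both cases shown above), and that "+ 1" behaves like ++ for this purpose. first_stuck_power_of_two is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: double $n until adding 1 no longer changes it.
# Only powers of two are probed, so this relies on the assumptions above.
sub first_stuck_power_of_two {
    my $n = 1;
    $n *= 2 while $n + 1 != $n;
    return $n;
}

printf "%.0f\n", first_stuck_power_of_two();
```

The result should agree with the repeated last value in the one-liner output that corresponds to your build.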

ikegami answered Dec 04 '22 10:12