When entering 1e9999999999999999999999999999999 into R, R hangs and will not respond, requiring it to be terminated.
It happens on 3 different computers and on two OSes (Windows 7 and Ubuntu). It happens in RStudio, RGui and RScript.
Here's some code to generate the number more easily:
boom <- paste(c("1e", rep(9, 31)), collapse="")
eval(parse(text=boom))
Now clearly this isn't a practical problem. I have no need to use numbers of this magnitude. It's just a question of curiosity.
Curiously, if you try 1e9999999999999999999999999999998 or 1e10000000000000000000000000000000 (subtract or add one to the power), you get Inf and 0 respectively. This number is clearly some kind of boundary, but between what, and why here?
I considered that it might be:
EDIT: As of 2015-09-15 at the latest, this no longer causes R to hang. They must have patched it.
This looks like an extreme case in the parser. The XeY format is described in Section 10.3.1: Literal Constants of the R Language Definition, which points to ?NumericConstants for "up-to-date information on the currently accepted formats".
The problem seems to be how the parser handles the exponent. The numeric constant is handled by NumericValue (line 4361 of main/gram.c), which calls mkFloat (line 4124 of main/gram.c), which calls R_atof (line 1584 of main/util.c), which calls R_strtod4 (line 1461 of main/util.c). (All as of revision 60052.)
Line 1464 of main/util.c shows expn declared as int, and it will overflow at line 1551 if the exponent is too large. Signed integer overflow is undefined behavior in C.
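To make the overflow concrete, here is a minimal sketch of what a 32-bit int accumulator could end up holding after reading the exponent digits one at a time. It assumes two's-complement wrap-around, which C does not guarantee for signed overflow, so it is only one plausible outcome. The helpers wrap_int32 and simulate_expn are hypothetical names and are not part of the R sources. Doubles are used for the arithmetic because R's own integer type returns NA on overflow rather than wrapping.

```r
# Hypothetical helper: map a double onto the signed 32-bit range,
# emulating two's-complement wrap-around (an assumption, not guaranteed by C).
wrap_int32 <- function(x) ((x + 2^31) %% 2^32) - 2^31

# Hypothetical helper: accumulate decimal digits the way a loop of the form
# expn = expn * 10 + digit would, wrapping after every step.
simulate_expn <- function(digits) {
  expn <- 0
  for (d in as.integer(strsplit(digits, "")[[1]])) {
    expn <- wrap_int32(expn * 10 + d)
  }
  expn
}

nines <- paste(rep("9", 31), collapse = "")
simulate_expn(nines)                              # 2147483647, i.e. INT_MAX
simulate_expn("9999999999999999999999999999998")  # 2147483646
simulate_expn("10000000000000000000000000000000") # -2147483648, i.e. INT_MIN
```

Under this assumption the 31-nines exponent lands exactly on the largest 32-bit int, and its two neighbours land on a huge positive and a huge negative exponent, which is at least consistent with the Inf and 0 reported in the question.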
For example, the code below produces values for exponents < 308 or so and Inf for exponents > 308.
const <- paste0("1e",2^(1:31)-2)
for(n in const) print(eval(parse(text=n)))
You can see the undefined behavior for exponents > 2^31 (R hangs for an exponent = 2^31):
const <- paste0("1e",2^(31:61)+1)
for(n in const) print(eval(parse(text=n)))
I doubt this will get any attention from R-core because R can only store numeric values between about 2e-308 and 2e+308 (see ?double) and this number is way beyond that.
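As a quick check of those limits, the sketch below reads them from the running session via .Machine and shows what happens to literals just outside the range. It assumes a standard IEEE 754 double, which is what R uses on common platforms.

```r
# Limits of R's double type, as reported by the running R session.
xmax <- .Machine$double.xmax  # largest finite double, about 1.8e308
xmin <- .Machine$double.xmin  # smallest positive normalized double, about 2.2e-308
print(c(xmax, xmin))

# A literal beyond the largest finite double parses to Inf ...
print(is.infinite(1e309))
# ... and a literal far too small to represent underflows to zero.
print(1e-400 == 0)
```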
This is interesting, but I think R has systemic problems with parsing numbers that have very large exponents:
> 1e10000000000000000000000000000000
[1] 0
> 1e1000000000000000000000000000000
[1] Inf
> 1e100000000000000000000
[1] Inf
> 1e10000000000000000000
[1] 0
> 1e1000
[1] Inf
> 1e100
[1] 1e+100
There we go, finally something reasonable. According to this output and Joshua Ulrich's comment below, R can represent numbers up to about 2e308, and it can parse exponents up to about +2*10^9 even though it cannot represent the resulting values. Beyond that, the behavior is undefined, apparently due to overflow.
R might sometimes use bignums. Perhaps 1e9999999999999999999999999999999 is some threshold, or perhaps the parsing routines have a limited buffer for reading the exponent. Your observation would be consistent with a 32 char (null-terminated) buffer for the exponent.
I'd rather ask that question on a forum or mailing list specific to R, which are rumored to be friendly.
Alternatively, since R is free software, you could investigate its source code.