
Why does type.convert not convert large "integers" that are stored as numeric to integer?

Tags: r

class(type.convert(as.numeric(1e3)))
# [1] "integer"
class(type.convert(as.numeric(1e4)))
# [1] "integer"
class(type.convert(as.numeric(1e5)))
# [1] "numeric"
class(type.convert(as.numeric(1e6)))
# [1] "numeric"

Why are the larger ones not converted to integers? They are still far below the integer limit:

.Machine$integer.max
# [1] 2147483647

Maybe the answer can be found in the C source of typeconvert on GitHub? Unfortunately I am quite unfamiliar with C.

minem asked Dec 05 '18 16:12





1 Answer

OK, this is less strange than it appears. Let's look at the source code of utils:::type.convert.default:

function (x, na.strings = "NA", as.is = FALSE, dec = ".", numerals = c("allow.loss", 
    "warn.loss", "no.loss"), ...) 
{
    if (is.array(x)) 
        storage.mode(x) <- "character"
    else x <- as.character(x)
    .External2(C_typeconvert, x, na.strings, as.is, dec, match.arg(numerals))
}

The important part is x <- as.character(x): whatever the input is, it gets coerced to character before any type conversion is attempted. (I find this rather peculiar, since a numeric or integer vector could simply be returned as is, without further processing.) How the coercion turns out depends on the type and value of x. For instance:

#numeric value
as.character(100000)
#[1] "1e+05"
#integer value
as.character(100000L)
#[1] "100000"
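
To tie this back to the four values in the question, one can look at the strings that as.character produces for them. This is only a small sketch: the variable names vals and strs are mine, and I have left out the printed output because the exact strings can depend on the R version and on options such as scipen.

```r
# The four values from the question, stored as doubles
vals <- c(1e3, 1e4, 1e5, 1e6)

# as.character() formats each element on its own
strs <- as.character(vals)

# Show each value next to its string form, flagging scientific notation
data.frame(
  value      = vals,
  as_string  = strs,
  scientific = grepl("e", strs, fixed = TRUE)
)
```

Any row flagged as scientific is a value whose string form type.convert will not read as an integer.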

When type.convert then processes the string, "100000" is a valid integer string while "1e+05" is not, and this explains the different behaviour. Note that as.character also depends on the scipen option: if it is set high enough, as.character does not produce scientific notation but a plain digit string, which type.convert can recognise as an integer.

options(scipen=999)
options("scipen")
as.character(100000)
#[1] "100000"
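
The string dependence can also be checked directly, without going through as.character at all. And if the goal is just to turn whole-valued doubles into integers, a route that does not depend on any global option may be preferable. The following is a rough sketch under that assumption; to_integer_if_whole is a made-up helper name, not part of base R or utils.

```r
# Feed type.convert() the two strings directly
class(type.convert("100000", as.is = TRUE))
# [1] "integer"
class(type.convert("1e+05", as.is = TRUE))
# [1] "numeric"

# Hypothetical helper (not a base R function): convert a double vector
# to integer only if every non-missing value is whole and within the
# integer range; otherwise return the input unchanged.
to_integer_if_whole <- function(x) {
  stopifnot(is.numeric(x))
  v <- x[!is.na(x)]
  if (all(v == trunc(v)) && all(abs(v) <= .Machine$integer.max)) {
    as.integer(x)
  } else {
    x
  }
}

class(to_integer_if_whole(1e5))
# [1] "integer"
class(to_integer_if_whole(1.5))
# [1] "numeric"
```

This works on the numeric values themselves, so it never touches the string representation and leaves scipen alone.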
nicola answered Oct 26 '22 07:10
