In the console, go ahead and try
> sum(sapply(1:99999, function(x) { x != as.character(x) }))
0
For all values from 1 through 99999, the comparisons "1" == 1, "2" == 2, ..., 99999 == "99999" are TRUE. However,
> 100000 == "100000"
FALSE
Why does R have this quirky behavior, and is this a bug? What would be a workaround to, e.g., check whether every element in an atomic character vector is in fact numeric? Right now I am trying to check whether x == as.numeric(x) for each x, but that fails on certain datasets because of the above problem!
Have a look at as.character(100000). Its value is not equal to "100000" (have a look for yourself), and R is essentially just telling you so.
as.character(100000)
# [1] "1e+05"
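As a small illustration of the conversion above, the sketch below contrasts a double literal with an integer literal. The variable names are made up for this example, and the double result assumes default options.

```r
# A bare literal such as 100000 is a double; the L suffix makes it an integer.
type_double  <- typeof(100000)    # "double"
type_integer <- typeof(100000L)   # "integer"

# Integers are converted to character without scientific notation.
int_string  <- as.character(100000L)   # "100000"
int_matches <- 100000L == "100000"     # TRUE

# The double is the case discussed above ("1e+05" under default options).
dbl_string <- as.character(100000)
print(c(int_string, dbl_string))
```

This suggests the surprise is tied to how doubles are formatted, not to the magnitude of the number as such.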
Here, from ?Comparison, are R's rules for applying relational operators to values of different types:
If the two arguments are atomic vectors of different types, one is coerced to the type of the other, the (decreasing) order of precedence being character, complex, numeric, integer, logical and raw.
Those rules mean that when you test whether 1=="1", say, R first converts the numeric value on the LHS to a character string, and then tests for equality of the character strings on the LHS and RHS. In some cases those will be equal, but in other cases they will not. Which cases produce inequality depends on the current settings of options("scipen") and options("digits").
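To make the direction of the coercion concrete, here is a minimal sketch (variable names are illustrative only). Because the number is turned into a string, the comparison is a string comparison, so strings that represent the same number in a different form do not match.

```r
# The numeric 1 becomes the string "1" before the comparison happens.
same_form   <- 1 == "1"     # TRUE:  "1" vs "1"
trailing    <- 1 == "1.0"   # FALSE: "1" vs "1.0"
leading     <- 1 == "01"    # FALSE: "1" vs "01"

# Converting the string to a number instead gives a numeric comparison.
numeric_cmp <- 1 == as.numeric("1.0")   # TRUE

print(c(same_form, trailing, leading, numeric_cmp))
```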
So, when you type 100000=="100000", it is as if you were actually performing the following test (note that internally, R may well use something other than as.character() to perform the conversion):
as.character(100000)=="100000"
# [1] FALSE
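As for the workaround asked about in the question, one possible approach (a sketch, not the only way) is to coerce in the other direction: convert the strings to numbers and see which ones fail to parse. The helper name is_numeric_string is hypothetical, introduced only for this example.

```r
# Hypothetical helper: TRUE for each element that as.numeric() can parse.
# Assumption: "numeric" means "accepted by as.numeric()", which also
# admits forms such as "1e5" and "Inf".
is_numeric_string <- function(x) {
  stopifnot(is.character(x))
  # Unparseable strings become NA (with a warning, suppressed here).
  !is.na(suppressWarnings(as.numeric(x)))
}

x <- c("1", "100000", "1e5", "abc", "3.14")
flags <- is_numeric_string(x)
print(flags)        # TRUE TRUE TRUE FALSE TRUE
print(all(flags))   # FALSE, because of "abc"

# Comparing numbers with numbers avoids the string conversion entirely.
print(as.numeric("100000") == 100000)   # TRUE
```

Note that a missing value (NA_character_) is reported as FALSE by this sketch, which may or may not be what a given dataset needs.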