I have a question regarding selecting specific values from a vector in R. More specifically, I want to select all integer values from a given variable in my dataset (I want to use these to subset my data). Here is an example:
x <- seq(0,10,1/3)
Now I want to select all the elements of the vector x that have integer values. My first idea was to use the is.integer command, but this does not work. I found a workaround using the following:
> x==as.integer(x)
[1] TRUE FALSE FALSE TRUE FALSE FALSE TRUE FALSE FALSE TRUE FALSE FALSE TRUE FALSE FALSE TRUE
[17] FALSE FALSE TRUE FALSE FALSE TRUE FALSE FALSE TRUE FALSE FALSE TRUE FALSE FALSE TRUE
Now I can simply type
> which(x==as.integer(x))
[1] 1 4 7 10 13 16 19 22 25 28 31
and I get the expected result (and I can use this vector for subsetting my dataset). But isn't there a more direct way to select integer values?
This is a counterexample to the suggestion to use the modulo operator:
> x <- seq(1/3, 9 , 1/3)
> x[!x%%1]
[1] 1 3 4 9
> x
[1] 0.3333333 0.6666667 1.0000000 1.3333333 1.6666667 2.0000000
[7] 2.3333333 2.6666667 3.0000000 3.3333333 3.6666667 4.0000000
[13] 4.3333333 4.6666667 5.0000000 5.3333333 5.6666667 6.0000000
[19] 6.3333333 6.6666667 7.0000000 7.3333333 7.6666667 8.0000000
[25] 8.3333333 8.6666667 9.0000000
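To see why the modulo test misses values that print as whole numbers, it may help to look at the remainders at full precision. This is only a sketch; the name r is introduced here for illustration. A remainder that is a tiny positive number, or one just below 1, fails the exact test against zero even though the value prints as an integer.

```r
x <- seq(1/3, 9, 1/3)

# Remainder of each element after division by 1 (always in [0, 1))
r <- x %% 1

# Show the remainders for every third element (the ones that print as 1, 2, ..., 9)
# with enough digits to expose any floating-point error
print(r[seq(3, length(x), by = 3)], digits = 17)

# How many of those remainders are exactly zero?
sum(r == 0)
```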
There are many similar questions on SO about why you should not assume that typical operations on numeric values will reliably produce exact integers. The canonical warning is R-FAQ 7.31, "Why doesn't R think these numbers are equal?", which on my device is found in the R help pages. A more reliable approach would be:
> x[ abs(x-round(x) ) < 0.00000001 ]
[1] 1 2 3 4 5 6 7 8 9
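The same tolerance test can be wrapped in a small helper so it is easy to reuse when subsetting. This is a minimal sketch: is_whole is a hypothetical name, not a base R function, and the default tolerance is an assumption modelled on the is.wholenumber example in the ?is.integer help page.

```r
# Hypothetical helper: TRUE where a numeric value lies within `tol`
# of the nearest whole number
is_whole <- function(x, tol = sqrt(.Machine$double.eps)) {
  abs(x - round(x)) < tol
}

x <- seq(1/3, 9, 1/3)

# Positions of the (near-)integer values, usable for subsetting
whole_idx <- which(is_whole(x))

# The values themselves
whole_vals <- x[is_whole(x)]
```

For a data frame, the logical vector can be used directly as a row index, for example df[is_whole(df$var), ], where df and var stand in for your own dataset and column.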