Gotchas with logical indexing and "which" in R

Tags: indexing, r
Are there any circumstances in R where using which() for indexing can't be mixed with logical indexing? I seem to recall coming across a gotcha with these two a few months ago: something with the flavor of R maintaining some internal notion of row number, which did not play nicely with which() after I'd used logical indexing elsewhere to drop some rows.

Is this a known phenomenon, or did I dream the whole thing?

dwh asked Jun 14 '11 00:06


2 Answers

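Be aware that NA, NaN, and similar values can confuse the situation: logical indexing returns NA wherever the comparison is NA, whereas which() silently drops those positions. Following @mdsumner's example: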

> x <- c(1:10,NA,NaN,Inf)
> x > 5
 [1] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE    NA    NA
[13]  TRUE
> x[x > 5]
[1]   6   7   8   9  10  NA  NA Inf
> x[which(x > 5)]
[1]   6   7   8   9  10 Inf
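If the NA entries are unwanted but you prefer to stay with logical indexing, one option is to mask them out explicitly. The following is a minimal sketch using the same x as above; the variable names by_logical, by_which, and by_masked are only illustrative.

```r
x <- c(1:10, NA, NaN, Inf)

# Logical indexing: an NA in the index produces an NA in the result.
by_logical <- x[x > 5]

# which() keeps only the positions that are TRUE, so NA positions vanish.
by_which <- x[which(x > 5)]

# Masking out missing values first makes logical indexing match which().
# Note that is.na() is TRUE for both NA and NaN.
by_masked <- x[!is.na(x) & x > 5]

print(by_logical)
print(by_which)
print(identical(by_which, by_masked))
```

Which form is preferable depends on whether the NA entries carry information you want to see in the result.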
nullglob answered Oct 26 '22 15:10


which() returns integer positions that select elements from a vector, or slices from a matrix, array, or data.frame. These cannot be "mixed" with logical vectors in the same index.

Consider the logical vector for all numbers > 5 in this vector:

x <- 1:10
x > 5
[1] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE

That is a 10-element logical vector, but the which() equivalent has length 5:

which(x > 5)
[1]  6  7  8  9 10
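To make the difference concrete, here is a small sketch of what happens if the two kinds of index are combined in one vector, and how to convert between them instead. The names idx_lgl, idx_int, mixed, and as_logical are made up for this illustration.

```r
x <- 1:10
idx_lgl <- x > 5          # logical vector, length 10
idx_int <- which(x > 5)   # integer vector, length 5

# c() coerces logicals to integers, so TRUE silently becomes position 1.
mixed <- c(TRUE, idx_int)
print(mixed)
print(x[mixed])

# Converting to a single kind of index first keeps the meaning intact.
as_logical <- seq_along(x) %in% idx_int
print(identical(as_logical, idx_lgl))
print(identical(which(idx_lgl), idx_int))
```

R raises no error for the mixed index; it simply selects a different set of elements, which is why this kind of mistake can go unnoticed.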

There's nothing complicated about not being able to mix these things; they are simply different kinds of index. The first implicitly discards the first five elements and keeps the last five by matching positions between the data and the logical vector:

x[x > 5]

and the second explicitly selects only the last five elements:

x[which(x > 5)]

Same result, but the argument to the "[" operator is quite different in each case. This applies whether the selected elements are singleton values in a vector or rows in a data.frame.
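One plausible source of the "internal row number" gotcha described in the question, offered here as a guess rather than a confirmed diagnosis, is that a data.frame keeps its original row names after logical subsetting, while which() reports positions within whatever object it is applied to. A minimal sketch with hypothetical data (df, kept, and pos are illustrative names):

```r
df <- data.frame(v = 1:10)              # hypothetical data
kept <- df[df$v > 5, , drop = FALSE]    # logical indexing drops five rows

# The surviving rows keep their original row names as labels.
print(rownames(kept))

# which() returns positions within kept, not within df.
pos <- which(kept$v > 8)
print(pos)

# Correct: apply the positions to the object they were computed from.
print(kept[pos, , drop = FALSE])

# Misleading: the same positions pick different rows of the original df.
print(df[pos, , drop = FALSE])
```

The printed row names of kept look like row numbers of df, which makes it easy to confuse them with the positions that which() returns.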

mdsumner answered Oct 26 '22 13:10