Get positions for NAs only in the "middle" of a matrix column

Tags: r

I want to obtain an index that refers to the positions of NA values in a matrix where the index is true if a given cell is NA and there is at least one non-NA value before and after it in the column. For example, given the following matrix

     [,1] [,2] [,3] [,4]
[1,]   NA    1   NA    1
[2,]    1   NA   NA    2
[3,]   NA    2   NA    3

the only value of the index that comes back TRUE should be [2,2].

Is there a compact expression for what I want to do? If I had to, I could loop through the columns and use something like min(which(!is.na(x[,i]))) to find the first non-NA value in each column, and then set all values before that to FALSE (and do the same for all values after the max). This way I would not select leading and trailing NA values. But this seems a bit messy, so I'm wondering if there is a cleaner expression that does this without loops.

EDIT: To be valid, an NA value only needs to have a non-NA value before and after it somewhere within the column, not necessarily adjacent to it. For instance, if a column were defined by c(NA, 3, NA, NA, NA, 4, NA), the NAs I want to find would be the ones at positions 3, 4, and 5, as these are enclosed by non-NA values.

Abiel asked Jan 28 '11 21:01


2 Answers

Haven't tested this very thoroughly, but it does work on the test case:

z <- matrix(c(NA,1,NA,1,NA,2,NA,NA,NA,1,2,3),ncol=4)
isNA <- is.na(z)
# Vertical index which increments at non-NA entries, counting top-to-bottom:
nonNA_idx.tb <- apply(!isNA, 2, cumsum)
# Vertical index which increments at non-NA entries, counting bottom-to-top:
nonNA_idx.bt <- apply(!isNA, 2, function(x) { rev(cumsum(rev(x))) })
which(isNA & nonNA_idx.tb>0 & nonNA_idx.bt>0, arr.ind=TRUE)

(PS -- I think it's pretty cute, but I'm biased)
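
To check the cumulative-sum idea against the vector from the question's EDIT as well, here is a minimal sketch that wraps the same steps in a helper. The name `enclosed_na` is hypothetical (it is not part of the answer), and the code assumes the input is a vector or a matrix.

```r
# Hypothetical helper (not from the answer): TRUE where a cell is NA and has
# at least one non-NA value somewhere above and somewhere below it in its column.
enclosed_na <- function(x) {
  x <- as.matrix(x)                    # a plain vector becomes a one-column matrix
  is_na <- is.na(x)
  # Running count of non-NA entries, top-to-bottom and bottom-to-top:
  seen_above <- apply(!is_na, 2, cumsum) > 0
  seen_below <- apply(!is_na, 2, function(col) rev(cumsum(rev(col)))) > 0
  is_na & seen_above & seen_below
}

# Vector from the question's EDIT: the enclosed NAs are at positions 3, 4, 5.
v <- c(NA, 3, NA, NA, NA, 4, NA)
which(enclosed_na(v))
# 3 4 5

# Test matrix from the question: only [2,2] is selected.
z <- matrix(c(NA, 1, NA, 1, NA, 2, NA, NA, NA, 1, 2, 3), ncol = 4)
which(enclosed_na(z), arr.ind = TRUE)
```

The leading NA at position 1 and the trailing NA at position 7 are dropped because one of the two running counts is still zero there.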

Ben Bolker answered Nov 14 '22 21:11


m <- matrix(c(NA, 1, NA, 1, NA, 2, NA, NA, NA, 1, 2, 3), ncol= 4)

matmain <- is.na(m)
matprev <- rbind(FALSE, head(!matmain, -1))
matnext <- rbind(tail(!matmain, -1), FALSE)

which(matmain & (matprev | matnext), arr.ind = TRUE)

I interpreted the question slightly differently. When you say before and after in the column, do you mean immediately before and after, or anywhere before and after? With the following test matrix, we have [2,1], [3,1], and [2,2], but what about [2,3]?

m <- matrix(c(1, NA, NA, 5, 1, NA, 3, 5, 4, NA, NA, NA, 1, 2, 3, 5), ncol= 4)
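
For comparison, here is a rough sketch of the "anywhere before and after" reading (the one the question's EDIT describes) applied to this test matrix. It follows the min(which(...)) idea from the question rather than the shifted-matrix approach above; the names `first_ok`, `last_ok`, and `anywhere` are made up for illustration.

```r
m <- matrix(c(1, NA, NA, 5, 1, NA, 3, 5, 4, NA, NA, NA, 1, 2, 3, 5), ncol = 4)
na <- is.na(m)

# Row index of the first and last non-NA entry in each column.
# An all-NA column gets Inf / -Inf, so none of its cells can qualify.
first_ok <- apply(!na, 2, function(col) if (any(col)) min(which(col)) else Inf)
last_ok  <- apply(!na, 2, function(col) if (any(col)) max(which(col)) else -Inf)

# "Anywhere" reading: an NA lying strictly between the first and last
# non-NA rows of its own column.
anywhere <- na & row(m) > first_ok[col(m)] & row(m) < last_ok[col(m)]
which(anywhere, arr.ind = TRUE)
```

Under this reading [2,1], [3,1], and [2,2] are selected, while [2,3] is not, because column 3 has no non-NA value below row 1.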

J. Win. answered Nov 14 '22 20:11