
Identify NAs in sequence row-wise

Tags: r, na

I want to fill NA values in a row-wise sequence based on a condition. See the example below.

ID | Observation 1 | Observation 2 | Observation 3 | Observation 4 | Observation 5
A  | NA            | 0             | 1             | NA            | NA

The condition is:

  • all NA values that come before the non-NA values in the sequence should be left as NA;
  • all NA values that come after the non-NA values in the sequence should be tagged as "Remove".

In the example above, the NA value in Observation 1 should remain NA, while the NA values in Observations 4 and 5 should be changed to "Remove".

Asked by Prometheus, Jan 26 '26 16:01


1 Answer

You can define the function:

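replace.na <- function(r, val) {
  i <- is.na(r)
  j <- which(i)   # positions of the NAs
  k <- which(!i)  # positions of the non-NA values
  # replace only the NAs that come after the last non-NA value
  r[j[j > k[length(k)]]] <- val
  r
}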
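
To see what the function does before applying it to a whole data frame, it can be run on a single row written as a plain vector. This is only a small sketch: the function body is repeated from above so the snippet runs on its own, and the names x, out and all.na are illustrative.

```r
# Same definition as above, repeated so the snippet is self-contained.
replace.na <- function(r, val) {
  i <- is.na(r)
  j <- which(i)
  k <- which(!i)
  r[j[j > k[length(k)]]] <- val
  r
}

x <- c(NA, 0, 1, NA, NA)       # row A from the question
out <- replace.na(x, 999)
print(out)                     # NA 0 1 999 999

# A row with no non-NA value at all has no "last non-NA" position,
# so nothing is selected and every element stays NA.
all.na <- replace.na(c(NA, NA, NA), 999)
print(all(is.na(all.na)))      # TRUE
```

The leading NA is untouched because its position is smaller than the position of the last non-NA value.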

Then, assuming that you have a data.frame like so:

r <- data.frame(ID=c('A','B'),obs1=c(NA,1),obs2=c(0,NA),obs3=c(1,2),obs4=c(NA,3),obs5=c(NA,NA))
##  ID obs1 obs2 obs3 obs4 obs5
##1  A   NA    0    1   NA   NA
##2  B    1   NA    2    3   NA

We can apply the function over the rows for all numeric columns of r:

r[,-1] <- t(apply(r[,-1],1,replace.na,999))    
##  ID obs1 obs2 obs3 obs4 obs5
##1  A   NA    0    1  999  999
##2  B    1   NA    2    3  999

apply treats r[,-1] as a matrix and stores the result for each row as a column of its output matrix, so the result comes back transposed relative to the input. Therefore, we have to transpose the resulting matrix before assigning it back into the columns of r.
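
The orientation of the result of apply can be checked on a small matrix. This is a minimal sketch; m and res are illustrative names, and the identity function stands in for replace.na so that only the change of shape is visible.

```r
# 2 rows, 3 columns; matrix() fills column by column.
m <- matrix(c(NA, 1, 0, NA, 1, 2), nrow = 2)

# Apply a function that returns each row unchanged.
res <- apply(m, 1, function(row) row)

print(dim(m))     # 2 3
print(dim(res))   # 3 2: each row's result became a column

# Transposing restores the original orientation.
print(isTRUE(all.equal(t(res), m)))   # TRUE
```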

Another way to call replace.na is:

r[,-1] <- do.call(rbind,lapply(data.frame(t(r[,-1])),replace.na,999))

Here, we first transpose the numeric columns of r and convert the result to a data.frame, so that each row of r becomes one column of the new data frame. Since a data frame is a list of columns, lapply then calls replace.na once per original row, and do.call(rbind, ...) stacks the results back into rows.


If you instead want to flag all NAs after the first non-NA value, then the function replace.na should be:

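replace.na <- function(r, val) {
  i <- is.na(r)
  j <- which(i)   # positions of the NAs
  k <- which(!i)  # positions of the non-NA values
  # replace every NA that comes after the first non-NA value
  r[j[j > k[1]]] <- val
  r
}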
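
The difference between the two variants shows up on row B, which has an NA between non-NA values. In the sketch below, the names replace.na.last and replace.na.first are hypothetical and were chosen only so that both versions can coexist in one snippet; the bodies are the two definitions given in this answer.

```r
# Hypothetical names so both versions can be compared side by side.
replace.na.last <- function(r, val) {
  i <- is.na(r)
  j <- which(i)
  k <- which(!i)
  r[j[j > k[length(k)]]] <- val   # NAs after the LAST non-NA
  r
}

replace.na.first <- function(r, val) {
  i <- is.na(r)
  j <- which(i)
  k <- which(!i)
  r[j[j > k[1]]] <- val           # NAs after the FIRST non-NA
  r
}

b <- c(1, NA, 2, 3, NA)           # row B of the example data
after.last  <- replace.na.last(b, 999)
after.first <- replace.na.first(b, 999)
print(after.last)                 # 1 NA 2 3 999
print(after.first)                # 1 999 2 3 999
```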

Applying it to the data:

r[,-1] <- do.call(rbind,lapply(data.frame(t(r[,-1])),replace.na,999))
##  ID obs1 obs2 obs3 obs4 obs5
##1  A   NA    0    1  999  999
##2  B    1  999    2    3  999
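
The question asks for the label "Remove" rather than a numeric code. A minimal sketch of that, assuming it is acceptable for the observation columns to become character: passing a string as val coerces each row to character, so the result is a character matrix. The first version of replace.na is repeated here so the snippet runs on its own, and the names df, tagged and out are illustrative.

```r
replace.na <- function(r, val) {
  i <- is.na(r)
  j <- which(i)
  k <- which(!i)
  r[j[j > k[length(k)]]] <- val
  r
}

df <- data.frame(ID = c("A", "B"),
                 obs1 = c(NA, 1), obs2 = c(0, NA), obs3 = c(1, 2),
                 obs4 = c(NA, 3), obs5 = c(NA, NA))

# Assigning a string turns the numbers in each row into strings,
# so "tagged" is a character matrix with one row per row of df.
tagged <- t(apply(df[, -1], 1, replace.na, "Remove"))

# Rebuild a data frame from the ID column and the tagged matrix.
out <- data.frame(ID = df$ID, tagged)
print(out)
```

The leading NA in row A and the NA in the middle of row B are kept as NA; only the trailing NAs receive the label. If the numeric type has to be preserved, a numeric sentinel such as 999 remains the simpler choice.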
Answered by aichao, Jan 28 '26 08:01


