
Dropping all left NAs in a dataframe and left shifting the cleaned rows

Tags: r, na, tail

I have the following dataframe dat, which presents a row-specific number of NAs at the beginning of some of its rows:

dat <- as.data.frame(rbind(c(NA,NA,1,3,5,NA,NA,NA), c(NA,1:3,6:8,NA), c(1:7,NA)))
dat[3,2] <- NA   # put an NA in the middle of row 3
dat

#  V1 V2 V3 V4 V5 V6 V7 V8
#  NA NA  1  3  5 NA NA NA
#  NA  1  2  3  6  7  8 NA
#   1 NA  3  4  5  6  7 NA

My aim is to delete all the NAs at the beginning of each row and to shift the remaining values to the left, padding the end of each shifted row with NAs so that its length stays constant. NAs that are not at the beginning of a row, such as the one in the third row, should stay where they are.

The following code works as expected:

for (i in seq_len(nrow(dat))) {

    # only rows that start with an NA need shifting
    if (is.na(dat[i,1])) {
        # keep everything from the first non-NA value to the end of the row
        dat1 <- dat[i, min(which(!is.na(dat[i,]))):ncol(dat)]
        # pad with NAs on the right to restore the original width
        dat[i,]  <- data.frame( dat1, t(rep(NA, ncol(dat)-length(dat1))) )
    }

}

dat

returning:

#  V1 V2 V3 V4 V5 V6 V7 V8
#   1  3  5 NA NA NA NA NA
#   1  2  3  6  7  8 NA NA
#   1 NA  3  4  5  6  7 NA

I was wondering whether there is a more direct way to do this without a for-loop, ideally by using the tail function.

On this last point: min(which(!is.na(dat[1,]))) returns 3, as expected. But tail(dat[1,], min(which(!is.na(dat[1,])))) returns the initial row unchanged, and I don't understand why.

Thank you very much for any suggestion.

Stefano Lombardi asked Dec 25 '22 09:12


2 Answers

If you just want all NAs pushed to the end of each row (including any NAs in the middle of a row), you could try:

dat <- as.data.frame(rbind(c(NA,NA,1,3,5,NA,NA,NA), c(NA,1:3,6:8,NA), c(1:7,NA)))
dat[3,2] <- NA
dat

#   V1 V2 V3 V4 V5 V6 V7 V8
# 1 NA NA  1  3  5 NA NA NA
# 2 NA  1  2  3  6  7  8 NA
# 3  1 NA  3  4  5  6  7 NA

dat.new <- do.call(rbind, lapply(1:nrow(dat), function(x) t(matrix(dat[x, order(is.na(dat[x,]))]))))
colnames(dat.new) <- colnames(dat)
dat.new

#      V1 V2 V3 V4 V5 V6 V7 V8
# [1,] 1  3  5  NA NA NA NA NA
# [2,] 1  2  3  6  7  8  NA NA
# [3,] 1  3  4  5  6  7  NA NA
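
A possible variant of the same idea, offered only as a sketch: order(is.na(x)) sorts the non-NA positions before the NA positions and leaves ties in their original order, so applying it row by row moves every NA to the end. The helper name push_na_last and the copy dat.num are made up for this illustration; writing the result back with dat.num[] keeps a data.frame with numeric columns.

```r
dat <- as.data.frame(rbind(c(NA,NA,1,3,5,NA,NA,NA), c(NA,1:3,6:8,NA), c(1:7,NA)))
dat[3,2] <- NA

# Hypothetical helper: move every NA in a vector to the end,
# keeping the relative order of the non-NA values
push_na_last <- function(x) x[order(is.na(x))]

# apply() returns one column per input row, hence the t();
# assigning into dat.num[] keeps the data.frame structure
dat.num <- dat
dat.num[] <- t(apply(dat, 1, push_na_last))
dat.num
```

Note that, like the code above, this also moves the NA in the middle of the third row, so it does not preserve interior NAs.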
Silence Dogood answered Jan 30 '23 23:01


I don't think you can do this without a loop.

dat <- as.data.frame(rbind(c(NA,NA,1,3,5,NA,NA,NA), c(NA,1:3,6:8,NA), c(1:7,NA)))
dat[3,2] <- NA

#   V1 V2 V3 V4 V5 V6 V7 V8
# 1 NA NA  1  3  5 NA NA NA
# 2 NA  1  2  3  6  7  8 NA
# 3  1 NA  3  4  5  6  7 NA

t(apply(dat, 1, function(x) {
  if (is.na(x[1])) {
    y <- x[-seq_len(which.min(is.na(x))-1)]
    length(y) <- length(x)
    y
  } else x
}))

#     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
#[1,]    1    3    5   NA   NA   NA   NA   NA
#[2,]    1    2    3    6    7    8   NA   NA
#[3,]    1   NA    3    4    5    6    7   NA

Then turn the matrix into a data.frame if you must.
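
On the tail part of the question, a likely explanation: for a data.frame, tail() counts rows, not columns, so asking for the last 3 rows of a one-row data.frame just returns that row. On a plain vector, a positive n keeps the last n elements, while a negative n drops elements from the front, which is what is needed here. The sketch below illustrates this; shift_left is a hypothetical helper name, and the guard for first == 1 is there because tail(x, 0) would return an empty vector.

```r
dat <- as.data.frame(rbind(c(NA,NA,1,3,5,NA,NA,NA), c(NA,1:3,6:8,NA), c(1:7,NA)))
dat[3,2] <- NA

k <- min(which(!is.na(dat[1,])))   # 3: position of the first non-NA value in row 1

# tail() on a data.frame works on rows, so the single row comes back whole
tail(dat[1,], k)

# On a vector, a negative n drops the first k - 1 elements instead
x <- unlist(dat[1,])
tail(x, -(k - 1))

# Hypothetical helper: drop leading NAs with tail(), then pad back with NAs
shift_left <- function(x) {
  first <- which(!is.na(x))[1]
  if (is.na(first) || first == 1) return(x)   # all-NA row or nothing to drop
  y <- tail(x, -(first - 1))
  length(y) <- length(x)                      # pads the end with NA
  y
}

res <- t(apply(dat, 1, shift_left))
res
```

This still loops over the rows through apply(); it only swaps the negative index for tail().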

Roland answered Jan 30 '23 23:01