
Removing empty rows of a data file in R

Tags:

r

I have a dataset with empty rows. I would like to remove them:

myData <- myData[-which(apply(myData, 1, function(x) all(is.na(x)))), ]

It works OK. But now I would like to add a column to my data and initialize its first value:

myData$newCol[1] <- -999
Error in `$<-.data.frame`(`*tmp*`, "newCol", value = -999) :
  replacement has 1 rows, data has 0

Unfortunately, this fails, and I don't understand why or how to fix it. It worked when I removed one line at a time using:

TgData = TgData[2:nrow(TgData),] 

Or anything similar.

It also works when I use only the first 13,000 rows.

But it doesn't work with my actual data, which has 32,000 rows.

What did I do wrong? It seems to make no sense to me.

Antonin asked Jun 22 '11 08:06


1 Answer

I assume you want to remove rows that are all NAs. In that case, you can do the following:

data <- rbind(c(1,2,3), c(1, NA, 4), c(4,6,7), c(NA, NA, NA), c(4, 8, NA)) # sample data
data
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    1   NA    4
[3,]    4    6    7
[4,]   NA   NA   NA
[5,]    4    8   NA

data[rowSums(is.na(data)) != ncol(data),]
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    1   NA    4
[3,]    4    6    7
[4,]    4    8   NA
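The same filter should also work on a data frame like the one in the question, because is.na() on a data frame returns a logical matrix. Here is a minimal sketch; the small myData below is made up for illustration and only stands in for the real data:

```r
# Hypothetical stand-in for the asker's data frame; row 2 is entirely NA
myData <- data.frame(x = c(1, NA, 3),
                     y = c("a", NA, NA),
                     stringsAsFactors = FALSE)

# Keep rows where the number of NAs is smaller than the number of columns
cleaned <- myData[rowSums(is.na(myData)) != ncol(myData), ]

# Create the new column first, then set its first value
cleaned$newCol <- NA_real_
cleaned$newCol[1] <- -999
```

Creating the column with NA_real_ before assigning to element 1 is a deliberate choice here: on a column that does not exist yet, cleaned$newCol[1] <- -999 builds a length-one value that R recycles over every row.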

If you want to remove rows that have at least one NA, just change the condition:

data[rowSums(is.na(data)) == 0,]
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    6    7
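As for the error in the question, one plausible cause (an assumption, since the thread does not confirm it) is that which() found no all-NA rows in the full data. In that case it returns integer(0), and negating an empty index still selects nothing, so the result has zero rows. This would match the "data has 0" part of the message. The sketch below reproduces that behaviour on a made-up data frame named df:

```r
# Hypothetical data frame with NAs but no row that is entirely NA
df <- data.frame(a = c(1, NA, 3), b = c(4, 5, NA))

all_na <- apply(df, 1, function(x) all(is.na(x)))  # all FALSE for this data
idx <- which(all_na)                               # empty integer vector

# -integer(0) is still integer(0), so this drops every row
broken <- df[-idx, ]
nrow(broken)

# Logical indexing has no such edge case and keeps all rows here
kept <- df[!all_na, ]
nrow(kept)
```

Indexing with a logical vector, as in the rowSums() condition above, avoids the problem whether or not any empty rows exist.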
Wookai answered Oct 02 '22 14:10