
Remove rows in R matrix where all data is NA [duplicate]

Tags:

r

Possible Duplicate:
Removing empty rows of a data file in R

How would I remove rows from a matrix or data frame where all elements in the row are NA?

So to get from this:

     [,1] [,2] [,3]
[1,]    1    6   11
[2,]   NA   NA   NA
[3,]    3    8   13
[4,]    4   NA   NA
[5,]    5   10   NA

to this:

     [,1] [,2] [,3]
[1,]    1    6   11
[2,]    3    8   13
[3,]    4   NA   NA
[4,]    5   10   NA

The problem with na.omit is that it removes rows containing any NA, so it would give me this:

     [,1] [,2] [,3]
[1,]    1    6   11
[2,]    3    8   13

The best I have been able to do so far is use the apply() function:

> x[apply(x, 1, function(y) !all(is.na(y))),]
     [,1] [,2] [,3]
[1,]    1    6   11
[2,]    3    8   13
[3,]    4   NA   NA
[4,]    5   10   NA

but this seems quite convoluted. Is there something simpler that I am missing?

Thanks.

Thomas Browne asked Jun 24 '11 17:06






2 Answers

Solutions using rowSums() generally outperform apply() ones:

m <- structure(c( 1,  NA,  3,  4,  5,
                  6,  NA,  8, NA, 10,
                 11,  NA, 13, NA, NA),
               .Dim = c(5L, 3L))

m[rowSums(is.na(m)) != ncol(m), ]
     [,1] [,2] [,3]
[1,]    1    6   11
[2,]    3    8   13
[3,]    4   NA   NA
[4,]    5   10   NA
IRTFM answered Sep 22 '22 15:09
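
The question asks about both matrices and data frames, so here is a minimal sketch (not part of the answer above) of the same rowSums() idea applied to each. The names df, keep, m_clean, df_clean and one_row are made up for illustration. It also passes drop = FALSE, which is an addition here: without it, R simplifies the result to a plain vector when only one row survives.

```r
# Same 5 x 3 data as in the answer, held as a matrix and as a data frame.
m <- matrix(c( 1, NA,  3,  4,  5,
               6, NA,  8, NA, 10,
              11, NA, 13, NA, NA), nrow = 5)
df <- as.data.frame(m)

# TRUE for every row that has at least one non-NA value.
keep <- rowSums(!is.na(m)) > 0

# drop = FALSE keeps the result two-dimensional even if a single row remains.
m_clean  <- m[keep, , drop = FALSE]
df_clean <- df[rowSums(!is.na(df)) > 0, , drop = FALSE]

# Single-survivor case: rows 1 and 2 only, where row 2 is all NA.
one_row <- m[1:2, ][keep[1:2], , drop = FALSE]
```

Counting non-NA values (rowSums(!is.na(m)) > 0) selects the same rows as the answer's rowSums(is.na(m)) != ncol(m); which one reads better is a matter of taste.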


Sweep a test for all(is.na()) across rows, and remove where true. Something like this (untested as you provided no code to generate your data -- dput() is your friend):

R> ind <- apply(X, 1, function(x) all(is.na(x)))
R> X <- X[ !ind, ]
Dirk Eddelbuettel answered Sep 20 '22 15:09
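
Since the apply() answer above is marked as untested, the following is a small sketch that wraps the same idea in a function and checks it against the rowSums() approach on a toy matrix. The helper name drop_all_na_rows and the test matrix X are hypothetical, chosen only for this example; drop = FALSE is again an addition to keep the result two-dimensional.

```r
# Hypothetical helper (not part of base R): drop rows whose entries are all NA.
drop_all_na_rows <- function(x) {
  # is.na() gives a logical matrix for both matrices and data frames;
  # apply(..., 1, all) then flags the rows that are entirely NA.
  all_na <- apply(is.na(x), 1, all)
  x[!all_na, , drop = FALSE]
}

# Toy 3 x 2 matrix: row 2 is all NA, row 1 is only partly NA.
X <- matrix(c( 1, NA, 3,
              NA, NA, 6), nrow = 3)

res_apply   <- drop_all_na_rows(X)
res_rowsums <- X[rowSums(is.na(X)) != ncol(X), , drop = FALSE]

# Both approaches should keep rows 1 and 3.
same_result <- identical(res_apply, res_rowsums)
```

To share data like X in a question, as the answer suggests, dput(X) prints an R expression that recreates the object exactly.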