Remove completely NA rows in r

Tags: dataframe, r, na

I've been searching for this, and although it should be simple, I only found solutions that use complete cases or that select a subset of columns and then omit their NAs. In my case I have a data frame like this:

   vp01ob__0 vp01ob__1 vp01ob__2 vp01ob__3 vp01ob__4 vp01ob__5 vp01ob__6 vp01ob__7 vp01ob__8
   <chr>     <chr>     <chr>     <chr>     <chr>     <chr>     <chr>     <chr>     <chr>    
 1 NA        NA        NA        NA        NA        NA        NA        NA        NA       
 2 NA        NA        NA        NA        NA        NA        NA        NA        NA       
 3 a         NA        NA        NA        NA        NA        NA        NA        NA       
 4 NA        NA        NA        NA        NA        NA        NA        NA        NA 
 5 NA        NA        NA        NA        NA        NA        NA        NA        NA       
 6 NA        NA        NA        NA        NA        NA        NA        NA        NA       
 7 NA        b         NA        NA        NA        NA        NA        NA        NA  

It's a very sparse data frame, so I want to keep only the rows that contain at least one non-NA value, like this:

   vp01ob__0 vp01ob__1 vp01ob__2 vp01ob__3 vp01ob__4 vp01ob__5 vp01ob__6 vp01ob__7 vp01ob__8
   <chr>     <chr>     <chr>     <chr>     <chr>     <chr>     <chr>     <chr>     <chr>          
 3 a         NA        NA        NA        NA        NA        NA        NA        NA          
 7 NA        b         NA        NA        NA        NA        NA        NA        NA  

complete.cases() drops every row, and I couldn't find a way to make filter_all() or na.omit() do this. Any help would be appreciated.

Thanks!

Juan C asked Dec 17 '22 11:12


2 Answers

We could use if_all to drop the rows in which every column is NA:

library(dplyr)
df1 %>%
     filter(!if_all(everything(), is.na))

-output

#vp01ob__0 vp01ob__1 vp01ob__2 vp01ob__3 vp01ob__4 vp01ob__5 vp01ob__6 vp01ob__7 vp01ob__8
#3         a      <NA>        NA        NA        NA        NA        NA        NA        NA
#7      <NA>         b        NA        NA        NA        NA        NA        NA        NA

Or use if_any to keep the rows in which at least one column is not NA:

df1 %>%
    filter(if_any(everything(), ~ !is.na(.)))

-output

#vp01ob__0 vp01ob__1 vp01ob__2 vp01ob__3 vp01ob__4 vp01ob__5 vp01ob__6 vp01ob__7 vp01ob__8
#3         a      <NA>        NA        NA        NA        NA        NA        NA        NA
#7      <NA>         b        NA        NA        NA        NA        NA        NA        NA
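
The two filters above select the same rows because "not every column is NA" is the same condition as "at least one column is not NA". As a rough illustration of that logic only, here is a base R sketch that mirrors the two predicates with Reduce(). The data frame toy and its values are made up for the example, and this is not how dplyr implements if_all or if_any internally.

```r
# Small made-up example in the shape of the question's data.
toy <- data.frame(
  x = c(NA, "a", NA, NA),
  y = c(NA, NA, NA, "b"),
  z = c(NA, NA, NA, NA),
  stringsAsFactors = FALSE
)

# Row-wise "every column is NA", mirroring if_all(everything(), is.na).
all_na <- Reduce(`&`, lapply(toy, is.na))

# Row-wise "at least one column is not NA", mirroring
# if_any(everything(), ~ !is.na(.)).
any_value <- Reduce(`|`, lapply(toy, function(col) !is.na(col)))

# By De Morgan's law, negating the first mask gives the second.
identical(!all_na, any_value)  # TRUE

# Keep the rows that carry some information; drop = FALSE keeps a data frame.
kept <- toy[any_value, , drop = FALSE]
rownames(kept)  # "2" "4"
```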

data

df1 <- structure(list(vp01ob__0 = c(NA, NA, "a", NA, NA, NA, NA), vp01ob__1 = c(NA, 
NA, NA, NA, NA, NA, "b"), vp01ob__2 = c(NA, NA, NA, NA, NA, NA, 
NA), vp01ob__3 = c(NA, NA, NA, NA, NA, NA, NA), vp01ob__4 = c(NA, 
NA, NA, NA, NA, NA, NA), vp01ob__5 = c(NA, NA, NA, NA, NA, NA, 
NA), vp01ob__6 = c(NA, NA, NA, NA, NA, NA, NA), vp01ob__7 = c(NA, 
NA, NA, NA, NA, NA, NA), vp01ob__8 = c(NA, NA, NA, NA, NA, NA, 
NA)), class = "data.frame", row.names = c("1", "2", "3", "4", 
"5", "6", "7"))
akrun answered Dec 20 '22 02:12


Using rowSums(), which counts the non-NA values in each row (with df1 as the data frame):

library(dplyr)
df1 %>% filter(rowSums(!is.na(df1)) > 0)

Base R:

df1[rowSums(!is.na(df1)) > 0, ]
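
To see why complete.cases() and na.omit() remove everything here while the rowSums() test does not, a small sketch may help. The helper name drop_all_na_rows and the data frame toy are hypothetical and only for illustration. The drop = FALSE argument is an addition to the one-liner above: without it, subsetting a one-column data frame this way returns a plain vector.

```r
# Hypothetical helper, not from any package:
# keep the rows that have at least one non-NA value.
drop_all_na_rows <- function(data) {
  keep <- rowSums(!is.na(data)) > 0
  data[keep, , drop = FALSE]
}

# Made-up data: every row contains at least one NA.
toy <- data.frame(
  x = c(NA, "a", NA),
  y = c(NA, NA, "b"),
  z = c(NA, NA, NA),
  stringsAsFactors = FALSE
)

# complete.cases() and na.omit() require rows with no NA at all,
# so nothing survives.
nrow(toy[complete.cases(toy), ])  # 0
nrow(na.omit(toy))                # 0

# The rowSums() test only drops rows that are entirely NA.
nrow(drop_all_na_rows(toy))       # 2

# With a single column the result is still a data frame.
class(drop_all_na_rows(toy["x"])) # "data.frame"
```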
VitaminB16 answered Dec 20 '22 02:12