How to remove groups of observations with dplyr::filter()

For the following data

ds <- read.table(header = TRUE, text ="
id year attend
1 2007      1
1 2008      1
1 2009      1
1 2010      1
1 2011      1
8 2007      3
8 2008     NA
8 2009      3
8 2010     NA
8 2011      3
9 2007      2
9 2008      3
9 2009      3
9 2010      5
9 2011      5
10 2007     4
10 2008     4
10 2009     2
10 2010    NA
10 2011    NA
")
library(dplyr)  # provides %>% and the dplyr verbs used below
ds <- ds %>% dplyr::mutate(time = year - 2000)
print(ds)

How would I write a dplyr::filter() command to keep only the ids that don't have a single NA? So only subjects with ids 1 and 9 should stay after the filter.

andrey asked Jul 05 '14 05:07


2 Answers

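Or you could group by id and keep only the groups in which no attend value is missing:

# assumes dplyr is loaded, e.g. via library(dplyr)
ds %>%
  group_by(id) %>%
  filter(all(!is.na(attend)))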
#Source: local data frame [10 x 3]
#Groups: id

#  id year attend
#1   1 2007      1
#2   1 2008      1
#3   1 2009      1
#4   1 2010      1
#5   1 2011      1
#6   9 2007      2
#7   9 2008      3
#8   9 2009      3
#9   9 2010      5
#10  9 2011      5
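
To see why the grouped filter keeps or drops whole ids, it can help to look at the per-group flag on its own. The sketch below is an illustration rather than part of the original answer: it assumes dplyr is installed, rebuilds the question's data without the time column, and uses the made-up names `flags` and `kept`. all(!is.na(attend)) yields a single TRUE or FALSE per id, and filter() applies that one value to every row of the group.

```r
library(dplyr)

# Same ids, years and attend values as in the question, rebuilt compactly
ds <- data.frame(
  id     = rep(c(1, 8, 9, 10), each = 5),
  year   = rep(2007:2011, times = 4),
  attend = c(1, 1, 1, 1, 1,
             3, NA, 3, NA, 3,
             2, 3, 3, 5, 5,
             4, 4, 2, NA, NA)
)

# One logical value per id: TRUE only when no attend value is missing
# (`flags` is a hypothetical name used for this illustration)
flags <- ds %>%
  group_by(id) %>%
  summarise(complete = all(!is.na(attend)))

# The grouped filter recycles that single value across the group's rows
# (`kept` is likewise a hypothetical name)
kept <- ds %>%
  group_by(id) %>%
  filter(all(!is.na(attend))) %>%
  ungroup()

print(flags)
print(unique(kept$id))
```

Calling ungroup() at the end is optional; it only prevents later operations on the result from silently running per id.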
akrun answered Oct 20 '22 18:10


Use filter() in conjunction with base::ave(), which applies all() within each id and returns one value per row:

ds %>% dplyr::filter(ave(!is.na(attend), id, FUN = all))

This returns

    id year attend
 1   1 2007      1
 2   1 2008      1
 3   1 2009      1
 4   1 2010      1
 5   1 2011      1
 6   9 2007      2
 7   9 2008      3
 8   9 2009      3
 9   9 2010      5
 10  9 2011      5
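
As an illustration of what ave() contributes here, the following sketch uses only base R. It is not taken from the answer itself: the data is rebuilt from the question without the time column, and the names `keep` and `result` are hypothetical. ave() evaluates FUN within each id and returns a vector as long as its input, so every row carries its group's verdict and the vector can be used directly as a row filter.

```r
# Same ids, years and attend values as in the question, rebuilt compactly
ds <- data.frame(
  id     = rep(c(1, 8, 9, 10), each = 5),
  year   = rep(2007:2011, times = 4),
  attend = c(1, 1, 1, 1, 1,
             3, NA, 3, NA, 3,
             2, 3, 3, 5, 5,
             4, 4, 2, NA, NA)
)

# Row-level flag: TRUE for every row whose id has no missing attend value
# (`keep` is a hypothetical name used for this illustration)
keep <- ave(!is.na(ds$attend), ds$id, FUN = all)

# Plain logical indexing with the same flag that the filter() call uses
result <- ds[keep, ]

print(keep)
print(unique(result$id))
```

Because the flag has one entry per row, no group_by() step is needed before filtering.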
Robert Krzyzanowski answered Oct 20 '22 18:10