
Filter groups in dplyr that exclusively contain specific combinations of values

Tags: r, dplyr

Given a table like:

  id value
1  1     a
2  2     a
3  2     b
4  2     c
5  3     c

I would like to filter for:

a) the ids that only have value a, i.e. id 1.

b) the ids that contain a and b jointly, i.e. id 2.

Data:

df <- data.frame(id = c(1,2,2,2,3), value = c("a", "a", "b", "c", "c"))
asked Dec 14 '15 15:12 by chopin_is_the_best



1 Answer

Try

a)

df %>% group_by(id) %>% filter(all(value == "a"))

b)

df %>% group_by(id) %>% filter(all(c("a", "b") %in% value))
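
For reference, here is a self-contained sketch of both filters run on the sample data from the question. The object names only_a, has_a_and_b and exactly_a_b are made up for illustration. The third filter is an assumption about what a stricter reading of "exclusively" might mean (groups whose values are exactly a and b, nothing else); it is not part of the original answer.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(id = c(1, 2, 2, 2, 3),
                 value = c("a", "a", "b", "c", "c"))

# a) keep groups in which every row has value "a"
only_a <- df %>%
  group_by(id) %>%
  filter(all(value == "a")) %>%
  ungroup()

# b) keep groups that contain both "a" and "b";
#    other values in the same group are allowed
has_a_and_b <- df %>%
  group_by(id) %>%
  filter(all(c("a", "b") %in% value)) %>%
  ungroup()

# Assumed stricter variant: keep groups whose set of values
# is exactly {"a", "b"} (hypothetical reading, not from the answer)
exactly_a_b <- df %>%
  group_by(id) %>%
  filter(setequal(value, c("a", "b"))) %>%
  ungroup()
```

Note that b) keeps every row of a matching group, so for id 2 the row with value c is returned as well. With this sample data the stricter variant returns no rows, because id 2 also contains c.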
answered Nov 15 '22 21:11 by talat