Given a table like:
  id value
1  1     a
2  2     a
3  2     b
4  2     c
5  3     c
I would like to filter for:
a) the ids that only have value a, i.e. id 1.
b) the ids that contain a and b jointly, i.e. id 2.
Data:
df <- data.frame(id = c(1,2,2,2,3), value = c("a", "a", "b", "c", "c"))
To filter a column on several values at once, pass the data frame to filter() and, in the condition, write the column name, then the %in% operator, then a vector containing all the string values you want in the result.
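As a minimal sketch of the %in% approach described above, using the data from the question (the name rows_ab is only illustrative):

```r
library(dplyr)

# Sample data from the question
df <- data.frame(id = c(1, 2, 2, 2, 3),
                 value = c("a", "a", "b", "c", "c"))

# Keep every row whose value is either "a" or "b"
rows_ab <- df %>% filter(value %in% c("a", "b"))
rows_ab
```

Note that this tests each row on its own, so it keeps single rows from ids 1 and 2; it does not by itself tell you which ids have only "a" or both "a" and "b".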
group_by() takes an existing tbl and converts it into a grouped tbl where operations are performed "by group". ungroup() removes grouping. The filter() function is used to subset a data frame, retaining all rows that satisfy your conditions. To be retained, the row must produce a value of TRUE for all conditions.
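To see what "by group" changes, here is a small sketch on the question's data (the result names are illustrative). The same filter() condition gives different results before and after group_by():

```r
library(dplyr)

df <- data.frame(id = c(1, 2, 2, 2, 3),
                 value = c("a", "a", "b", "c", "c"))

# Ungrouped: n() counts all rows in the table (5), so every row is kept
ungrouped <- df %>% filter(n() > 1)

# Grouped: n() is evaluated per id, so only id 2 (three rows) is kept
grouped <- df %>%
  group_by(id) %>%
  filter(n() > 1) %>%
  ungroup()  # drop the grouping so later steps act on the whole table

# Several conditions separated by commas are combined with AND
both <- df %>% filter(id == 2, value != "c")
```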
dplyr's filter() function handles this kind of filtering, and it goes further: it can express filters that are hard or complicated to construct with SQL or traditional BI tools in a simpler, more intuitive way.
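Try grouping by id and filtering on a per-group condition:
library(dplyr)
a) keep the ids whose values are all "a":
df %>% group_by(id) %>% filter(all(value == "a"))
b) keep the ids whose values include both "a" and "b":
df %>% group_by(id) %>% filter(all(c("a", "b") %in% value))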
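For reference, here is a self-contained sketch that runs both filters on the question's data and reduces each result to the matching ids. The intermediate names (only_a, a_and_b, ids_only_a, ids_a_and_b) are illustrative, not part of the original answer:

```r
library(dplyr)

df <- data.frame(id = c(1, 2, 2, 2, 3),
                 value = c("a", "a", "b", "c", "c"))

# a) ids whose values are all "a"
only_a <- df %>%
  group_by(id) %>%
  filter(all(value == "a")) %>%
  ungroup()

# b) ids whose values include both "a" and "b"
a_and_b <- df %>%
  group_by(id) %>%
  filter(all(c("a", "b") %in% value)) %>%
  ungroup()

# Reduce each result to the matching ids
ids_only_a <- unique(only_a$id)
ids_a_and_b <- unique(a_and_b$id)
```

On this data, ids_only_a is 1 and ids_a_and_b is 2. Filter b) keeps every row of a matching id, so the "c" row of id 2 is still in a_and_b; add a row-level condition such as value %in% c("a", "b") if only those rows are wanted.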