I'm writing functions that take in a data.frame and then do some operations. I need to add and subtract items from the group_by criteria in order to get where I want to go.
If I want to add a group_by criterion to a df, that's pretty easy:
library(tidyverse)
set.seed(42)
n <- 10
input <- data.frame(a = 'a',
                    b = 'b',
                    vals = 1)
input %>%
  group_by(a) ->
  grouped
grouped
#> # A tibble: 1 x 3
#> # Groups: a [1]
#> a b vals
#> <fct> <fct> <dbl>
#> 1 a b 1.
## add a group (note: dplyr >= 1.0.0 spells this argument `.add`; `add` is deprecated):
grouped %>%
  group_by(b, add=TRUE)
#> # A tibble: 1 x 3
#> # Groups: a, b [1]
#> a b vals
#> <fct> <fct> <dbl>
#> 1 a b 1.
## drop a group?
But how do I programmatically drop the grouping by b, which I added, yet keep all other groupings the same?
Here's an approach that uses tidyeval so that bare column names can be used as the function arguments. I'm not sure if it makes sense to convert the bare column names to text (as I've done below) or if there's a more elegant way to work directly with the bare column names.
drop_groups = function(data, ...) {
  # current grouping columns and the columns to drop, both as strings
  groups = map_chr(groups(data), rlang::quo_text)
  drop = map_chr(quos(...), rlang::quo_text)
  # warn about requested columns that are not grouping columns
  if(any(!drop %in% groups)) {
    warning(paste("Input data frame is not grouped by the following groups:",
                  paste(drop[!drop %in% groups], collapse=", ")))
  }
  # regroup by whatever is left
  data %>% group_by_at(setdiff(groups, drop))
}
d = mtcars %>% group_by(cyl, vs, am)
groups(d %>% drop_groups(vs, cyl))
[[1]]
am

groups(d %>% drop_groups(a, vs, b, c))
[[1]]
cyl

[[2]]
am

Warning message:
In drop_groups(., a, vs, b, c) :
  Input data frame is not grouped by the following groups: a, b, c
UPDATE: The approach below works directly with quosured column names, without converting them to strings. I'm not sure which approach is "preferred" in the tidyeval paradigm, or whether there is yet another, more desirable method.
drop_groups2 = function(data, ...) {
  groups = map(groups(data), quo)
  drop = quos(...)
  if(any(!drop %in% groups)) {
    warning(paste("Input data frame is not grouped by the following groups:",
                  paste(drop[!drop %in% groups], collapse=", ")))
  }
  data %>% group_by(!!!setdiff(groups, drop))
}
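For what it's worth, here is a minimal sketch of the same idea against current dplyr (assuming dplyr >= 1.0.0). It is not part of the approaches above: `drop_groups3` is a hypothetical name, and it leans on `group_vars()` to get the grouping columns as strings and on `rlang::ensyms()` to capture the bare column names, so treat it as one possible variant rather than the preferred tidyeval way.

```r
library(dplyr)

# Hypothetical helper, not from the original answer.
# Assumes dplyr >= 1.0.0; rlang is installed as a dplyr dependency.
drop_groups3 <- function(data, ...) {
  # Bare column names -> character vector, e.g. drop_groups3(d, vs, cyl)
  drop <- vapply(rlang::ensyms(...), rlang::as_string, character(1),
                 USE.NAMES = FALSE)
  current <- group_vars(data)  # current grouping columns, as strings

  # Warn about requested columns that are not grouping columns
  not_grouped <- setdiff(drop, current)
  if (length(not_grouped) > 0) {
    warning("Input data frame is not grouped by the following groups: ",
            paste(not_grouped, collapse = ", "))
  }

  keep <- setdiff(current, drop)
  # Regroup from scratch by the remaining columns; if nothing is left
  # to keep, the result is simply ungrouped.
  data %>% ungroup() %>% group_by(!!!rlang::syms(keep))
}

d <- mtcars %>% group_by(cyl, vs, am)
group_vars(drop_groups3(d, vs, cyl))
#> [1] "am"
```

Since everything is compared as plain strings, `setdiff()` keeps the remaining groups in their original order, and dropping every group just returns ungrouped data.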