What is the consensus on the best way to use group_by when the grouping columns are passed in as a variable? Consider the following simple function:
library(dplyr)
myFunction <- function(df, col_name) {
  out <- df %>%
    group_by(col_name) %>%
    summarize(mean = mean(mpg))
  return(out)
}
myFunction(mtcars, col_name = c('cyl', 'am'))
The call to this function returns an error stating that the column doesn't exist. I understand why, but I am not sure of the best approach to get around this. I can make it work if I only have one grouping variable by doing:
group_by(!!as.name(col_name))
This, however, doesn't work if col_name is a vector of length greater than 1.
Any ideas?
You can try:
myFunction <- function(df, col_name) {
  out <- df %>%
    group_by_at(vars(one_of(col_name))) %>%
    summarize(mean = mean(mpg))
  return(out)
}
myFunction(mtcars, col_name = c("cyl", "am"))
    cyl    am  mean
  <dbl> <dbl> <dbl>
1     4     0  22.9
2     4     1  28.1
3     6     0  19.1
4     6     1  20.6
5     8     0  15.0
6     8     1  15.4
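As a possible alternative, assuming you are on dplyr 1.0.0 or later (where group_by_at is marked as superseded), the same grouping can be sketched with across() and all_of(). The name myFunction2 is just a hypothetical label for this variant, and .groups = "drop" is my own choice to return an ungrouped result; it is not part of the answer above.

```r
library(dplyr)

# Hypothetical variant of the function above, assuming dplyr >= 1.0.0.
# all_of() takes a character vector of column names and raises an error
# if any of them is missing from df (one_of() only warns).
myFunction2 <- function(df, col_name) {
  df %>%
    group_by(across(all_of(col_name))) %>%
    summarize(mean = mean(mpg), .groups = "drop")
}

res <- myFunction2(mtcars, col_name = c("cyl", "am"))
print(res)
```

This also works when col_name holds a single column name, so the one-variable and many-variable cases go through the same code path.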