
dplyr group_by dynamic cols

Tags: r, dplyr, rlang

What is the consensus on the best way to call group_by when the grouping columns are passed in through a variable? Consider the following simple function:

library(dplyr)

myFunction <- function(df, col_name) {
  out <- df %>%
    group_by(col_name) %>%   # this is the call that fails
    summarize(mean = mean(mpg))

  return(out)
}

myFunction(mtcars, col_name = c('cyl', 'am'))

The call to this function returns an error stating that the column doesn't exist. I understand why, but I am not sure of the best approach to get around it. I can make it work if I only have one grouping variable by doing:

group_by(!!as.name(col_name)) 

This, however, doesn't work if col_name is a vector of length greater than 1.

Any ideas?

user1658170 asked Jan 25 '26 00:01


1 Answer

You can try:

myFunction <- function(df, col_name) {
 out <- df %>%
  group_by_at(vars(one_of(col_name))) %>%
  summarize(mean = mean(mpg))

 return(out)
}

myFunction(mtcars, col_name = c("cyl", "am"))

    cyl    am  mean
  <dbl> <dbl> <dbl>
1     4     0  22.9
2     4     1  28.1
3     6     0  19.1
4     6     1  20.6
5     8     0  15.0
6     8     1  15.4
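
In dplyr 1.0.0 and later, group_by_at() and one_of() are marked as superseded, so the same grouping could also be written with across() and all_of(). The following is only a minimal sketch under that version assumption; the function name mean_mpg_by is hypothetical and used here purely for illustration:

```r
library(dplyr)

# Hypothetical helper; assumes dplyr >= 1.0.0, where across() is available.
mean_mpg_by <- function(df, col_name) {
  df %>%
    # all_of() selects the columns named in the character vector and
    # raises an error if any of them is missing from df
    group_by(across(all_of(col_name))) %>%
    # .groups = "drop" returns an ungrouped result
    summarize(mean = mean(mpg), .groups = "drop")
}

res <- mean_mpg_by(mtcars, col_name = c("cyl", "am"))
print(res)
```

Called this way, it should produce the same six cyl/am combinations as the table above. If an rlang-based form is preferred, group_by(!!!rlang::syms(col_name)) is another option that extends the !!as.name() idea from the question to a character vector of any length.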
tmfmnk answered Jan 27 '26 16:01


