How to pass a variable name in group_by

I can calculate the rank of the values (val) in my data frame df within each group of name1 with the code:

res  <- df %>% arrange(val) %>% group_by(name1) %>% mutate(RANK=row_number())  

Instead of hard-coding the column name1 in the code, I want to pass it as a variable, e.g. crit1 = "name1". However, the code below does not work, since group_by looks for a column literally called crit1 instead of using the value stored in the variable.

res  <- df %>% arrange(val) %>% group_by(crit1) %>% mutate(RANK=row_number())  

How can I pass crit1 in the code?

Thanks, Tom

TomDriftwood asked Jul 06 '16 09:07

1 Answer

We can use group_by_

library(dplyr)
df %>%
    arrange(val) %>%
    group_by_(.dots=crit1) %>%
    mutate(RANK=row_number())
#Source: local data frame [10 x 4]
#Groups: name1, name2 [7]
#
#            val name1 name2  RANK
#          <dbl> <chr> <chr> <int>
#1  -0.848370044     b     c     1
#2  -0.583627199     a     a     1
#3  -0.545880758     a     a     2
#4  -0.466495124     b     b     1
#5   0.002311942     a     c     1
#6   0.266021979     c     a     1
#7   0.419623149     c     b     1
#8   0.444585270     a     c     2
#9   0.536585304     b     a     1
#10  0.847460017     a     c     3

Update

group_by_ is deprecated in recent versions of dplyr (0.8.1 at the time of this update), so we can use group_by_at instead, which takes a character vector of column names:

df %>%
    arrange(val) %>%
    group_by_at(crit1) %>%
    mutate(RANK=row_number())

Another option is to convert the strings to symbols (syms from rlang) and splice them into group_by with !!!:

df %>%
    arrange(val) %>%
    group_by(!!! rlang::syms(crit1)) %>%
    mutate(RANK = row_number())
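For readers on newer dplyr releases, here is a minimal sketch of the same grouping written with across() and all_of(). This is not part of the original answer; it assumes dplyr 1.0.0 or later, where across() is available, and it rebuilds the same toy data so it can be run on its own.

```r
library(dplyr)

# Same toy data as in this answer; crit1 holds the grouping column names.
set.seed(24)
df <- data.frame(val = rnorm(10),
                 name1 = sample(letters[1:3], 10, replace = TRUE),
                 name2 = sample(letters[1:3], 10, replace = TRUE),
                 stringsAsFactors = FALSE)
crit1 <- c("name1", "name2")

# all_of() selects exactly the columns named in the character vector
# and raises an error if any of them is missing from df.
res <- df %>%
  arrange(val) %>%
  group_by(across(all_of(crit1))) %>%
  mutate(RANK = row_number())

# Inspect which columns ended up as grouping variables.
group_vars(res)
```

Because all_of() fails loudly on a misspelled column name, a typo in crit1 shows up as an error rather than as a silently wrong grouping.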

data

set.seed(24)
df <- data.frame(val = rnorm(10),
                 name1 = sample(letters[1:3], 10, replace=TRUE),
                 name2 = sample(letters[1:3], 10, replace=TRUE),
                 stringsAsFactors=FALSE)

crit1 <- c("name1", "name2")

akrun answered Sep 20 '22 21:09