Use column index instead of name in group_by

Tags: r, dplyr

I want to summarize a dataframe with dplyr, like so:

> test <- data.frame(ID = c("A", "A", "B", "B"), val = c(1:4))
> test %>% group_by(ID) %>% summarize(av = mean(val))
# A tibble: 2 x 2
      ID    av
  <fctr> <dbl>
1      A   1.5
2      B   3.5

But suppose that instead of grouping by the column called "ID", I wish to group by the first column, regardless of its name. Is there a simple way to do that?

I've tried a few naive approaches (group_by(1), group_by(.[1]), group_by(., .[1]), group_by(names(.)[1])) to no avail. I'm only just beginning to use tidyverse packages, so I may be missing something obvious.

This question is very similar, but it's about mutate and I wasn't able to generalize it to my problem. This question is also similar, but the accepted answer is to use a different package, and I'm trying to stick with dplyr.

Joe asked Sep 26 '17 22:09


1 Answer

As of dplyr 1.0.0, you can use across() inside group_by() to select the grouping column by position:

library(dplyr)
test %>% 
  group_by(across(1)) %>% 
  summarise(av = mean(val))
## A tibble: 2 x 2
#  ID       av
#  <fct> <dbl>
#1 A       1.5
#2 B       3.5
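
As an illustration only (not part of the original answer), here is a minimal sketch of how positional grouping behaves. The object names (by_first, renamed, by_first_old, naive) and the column name "key" are made up for the example. It assumes dplyr >= 1.0.0 and that the installed version still exports the superseded group_by_at().

```r
library(dplyr)

test <- data.frame(ID = c("A", "A", "B", "B"), val = 1:4)

# Group by whatever column sits in position 1 (needs dplyr >= 1.0.0).
by_first <- test %>%
  group_by(across(1)) %>%
  summarise(av = mean(val))

# The same pipeline keeps working after the first column is renamed.
renamed <- test
names(renamed)[1] <- "key"  # "key" is an arbitrary, illustrative name
by_first_renamed <- renamed %>%
  group_by(across(1)) %>%
  summarise(av = mean(val))

# Older spelling: group_by_at() accepts column positions (now superseded).
by_first_old <- test %>%
  group_by_at(1) %>%
  summarise(av = mean(val))

# For contrast: group_by(1) computes a new constant column,
# so every row falls into a single group.
naive <- test %>%
  group_by(1) %>%
  summarise(av = mean(val))
```

This also suggests why the naive attempts in the question did not work: group_by() evaluates its arguments the way mutate() does, so a bare 1 or names(.)[1] is treated as a value to compute rather than as a column selection. across() switches to tidyselect semantics, where a number means a column position.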
Ian Campbell answered Oct 02 '22 21:10