
Group by aggregate dynamic column name matching

Tags: r, aggregate, dplyr

Is it possible to group_by using regex match on column names using dplyr?

library(dplyr) # dplyr_0.5.0; R version 3.3.2 (2016-10-31)

# dummy data
set.seed(1)
df1 <-  sample_n(iris, 20) %>% 
  mutate(Sepal.Length = round(Sepal.Length),
         Sepal.Width = round(Sepal.Width))

Group by, static version (looks and works fine here, but imagine typing out 10-20 columns):

df1 %>% 
  group_by(Sepal.Length, Sepal.Width) %>% 
  summarise(mySum = sum(Petal.Length))

Group by dynamic - "ugly" version:

df1 %>% 
  group_by_(.dots = colnames(df1)[ grepl("^Sepal", colnames(df1))]) %>% 
  summarise(mySum = sum(Petal.Length))

Ideally, something like this (doesn't work, as starts_with returns indices):

df1 %>% 
  group_by(starts_with("Sepal")) %>% 
  summarise(mySum = sum(Petal.Length))
Error in eval(expr, envir, enclos) : 
   wrong result size (0), expected 20 or 1

Expected output:

# Source: local data frame [6 x 3]
# Groups: Sepal.Length [?]
# 
#   Sepal.Length Sepal.Width mySum
#          <dbl>       <dbl> <dbl>
# 1            4           3   1.4
# 2            5           3  10.9
# 3            6           2   4.0
# 4            6           3  43.7
# 5            7           3  15.7
# 6            8           4   6.4

Note: sounds very much like a duplicated post, kindly link the relevant posts if any.

asked Apr 05 '17 10:04 by zx8754


2 Answers

This feature will be implemented in a future release; see GitHub issue #2619.

The solution is to use the group_by_at() function:

df1 %>%
  group_by_at(vars(starts_with("Sepal"))) %>% 
  summarise(mySum = sum(Petal.Length))

Edit: This is now implemented in dplyr_0.7.1.
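For readers on newer versions: as far as I know, dplyr 1.0.0 and later supersede the group_by_at() family with across(), which accepts the same selection helpers. The following is a minimal sketch under that assumption, not part of the original answer; matches() is used for the regex case the question asks about, and the `.groups = "drop"` argument is just one choice for the result's grouping.

```r
library(dplyr)  # assumes dplyr >= 1.0.0, where across() is available

# Same dummy data as in the question
set.seed(1)
df1 <- sample_n(iris, 20) %>%
  mutate(Sepal.Length = round(Sepal.Length),
         Sepal.Width  = round(Sepal.Width))

# Group by every column whose name starts with "Sepal"
res_prefix <- df1 %>%
  group_by(across(starts_with("Sepal"))) %>%
  summarise(mySum = sum(Petal.Length), .groups = "drop")

# Group by every column whose name matches a regular expression
res_regex <- df1 %>%
  group_by(across(matches("^Sepal"))) %>%
  summarise(mySum = sum(Petal.Length), .groups = "drop")
```

Both calls should select Sepal.Length and Sepal.Width as grouping columns. The exact rows depend on the sample drawn, which can differ between R versions even with the same seed.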

answered Sep 23 '22 06:09 by zx8754


If you just want to stick with dplyr functions, you can try:

df1 %>%
  group_by_(.dots = df1 %>% select(contains("Sepal")) %>% colnames()) %>%
  summarise(mySum = sum(Petal.Length))

It's not necessarily much prettier, but it gets rid of the regex.
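Note that group_by_() was deprecated in later dplyr releases. If the grouping columns are already held as a character vector, a possible equivalent on dplyr 1.0.0 or later is across(all_of(...)). This is a sketch under that version assumption; `grp_cols` is a hypothetical name introduced here for illustration, and the data uses the full iris set rather than the question's sample.

```r
library(dplyr)  # assumes dplyr >= 1.0.0

# Rounded copy of iris so that the Sepal columns form a few groups
df1 <- iris %>%
  mutate(Sepal.Length = round(Sepal.Length),
         Sepal.Width  = round(Sepal.Width))

# Column names collected up front (hypothetical helper variable)
grp_cols <- df1 %>% select(contains("Sepal")) %>% colnames()

# all_of() tells dplyr to treat grp_cols as column names, not as a column
res <- df1 %>%
  group_by(across(all_of(grp_cols))) %>%
  summarise(mySum = sum(Petal.Length), .groups = "drop")
```

Keeping the names in a variable is handy when the same set of columns is reused in several steps of a pipeline.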

answered Sep 22 '22 06:09 by Aramis7d