
Using dplyr summarise_at with column index


I noticed that when column indices are supplied to dplyr::summarize_at, the columns to summarise are determined after excluding the grouping column(s). I wonder whether that is intended, since under this design the correct column index depends on whether the columns being summarised are positioned before or after the grouping columns.

Here's an example:

    library(dplyr)
    data("mtcars")

    # grouping column after summarise columns
    mtcars %>% group_by(gear) %>% summarise_at(3:4, mean)
    ## A tibble: 3 x 3
    #   gear     disp       hp
    #  <dbl>    <dbl>    <dbl>
    #1     3 326.3000 176.1333
    #2     4 123.0167  89.5000
    #3     5 202.4800 195.6000

    # grouping columns before summarise columns
    mtcars %>% group_by(cyl) %>% summarise_at(3:4, mean)
    ## A tibble: 3 x 3
    #    cyl        hp     drat
    #  <dbl>     <dbl>    <dbl>
    #1     4  82.63636 4.070909
    #2     6 122.28571 3.585714
    #3     8 209.21429 3.229286

    # no grouping columns
    mtcars %>% summarise_at(3:4, mean)
    #      disp       hp
    #1 230.7219 146.6875

    # actual third & fourth columns
    names(mtcars)[3:4]
    #[1] "disp" "hp"

    packageVersion("dplyr")
    #[1] ‘0.7.2’

Notice how the summarised columns change depending on whether the data is grouped and on where the grouping column sits.

Is this the same on other platforms? Is it a bug or a feature?

talat asked Aug 25 '17 14:08



1 Answer

With dplyr version 0.7.5, this behavior can no longer be reproduced:

    library(dplyr)

    mtcars %>% group_by(gear) %>% summarise_at(3:4, mean)
    # # A tibble: 3 x 3
    #    gear  disp    hp
    #   <dbl> <dbl> <dbl>
    # 1     3  326. 176.
    # 2     4  123.  89.5
    # 3     5  202. 196.

    mtcars %>% group_by(cyl) %>% summarise_at(3:4, mean)
    # # A tibble: 3 x 3
    #     cyl  disp    hp
    #   <dbl> <dbl> <dbl>
    # 1     4  105.  82.6
    # 2     6  183. 122.
    # 3     8  353. 209.
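
If the code has to give the same result regardless of the installed dplyr version, one option is to avoid passing positions to summarise_at altogether. The following is only a sketch, not part of the version comparison above: it resolves the positions against the ungrouped data frame first and then passes the resulting column names, which summarise_at also accepts as a character vector. The variable names cols, by_gear and by_cyl are made up for illustration.

```r
library(dplyr)

# Look up the names of the 3rd and 4th columns in the full data frame,
# so the selection no longer depends on where the grouping column sits.
cols <- names(mtcars)[3:4]

# Same columns are summarised whether the grouping column comes
# after (gear) or before (cyl) the selected columns.
by_gear <- mtcars %>% group_by(gear) %>% summarise_at(cols, mean)
by_cyl  <- mtcars %>% group_by(cyl)  %>% summarise_at(cols, mean)

names(by_gear)
names(by_cyl)
```

Both results should contain the grouping column followed by disp and hp. Naming the columns directly with vars(disp, hp) works the same way and reads better when the names are known up front.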
Moody_Mudskipper answered Sep 19 '22 10:09
