I noticed that when supplying column indices to dplyr::summarize_at, the columns to be summarised are determined after excluding the grouping column(s). I wonder whether this is intended, because with this design the correct column index depends on whether the columns to summarise are positioned before or after the grouping columns.
Here's an example:
    library(dplyr)
    data("mtcars")

    # grouping column after summarise columns
    mtcars %>% group_by(gear) %>% summarise_at(3:4, mean)
    ## A tibble: 3 x 3
    #   gear     disp       hp
    #  <dbl>    <dbl>    <dbl>
    #1     3 326.3000 176.1333
    #2     4 123.0167  89.5000
    #3     5 202.4800 195.6000

    # grouping columns before summarise columns
    mtcars %>% group_by(cyl) %>% summarise_at(3:4, mean)
    ## A tibble: 3 x 3
    #    cyl        hp     drat
    #  <dbl>     <dbl>    <dbl>
    #1     4  82.63636 4.070909
    #2     6 122.28571 3.585714
    #3     8 209.21429 3.229286

    # no grouping columns
    mtcars %>% summarise_at(3:4, mean)
    #      disp       hp
    #1 230.7219 146.6875

    # actual third & fourth columns
    names(mtcars)[3:4]
    #[1] "disp" "hp"

    packageVersion("dplyr")
    #[1] ‘0.7.2’
Notice how the summarised columns change depending on grouping and position of the grouping column.
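The pattern in the output above is consistent with the indices being applied to the column names that remain after the grouping column is dropped. The base R sketch below only mimics that observed behaviour; it is an illustration, not dplyr's internal implementation, and the variable names are made up for this example.

```r
# Illustration only: reproduces the observed column choice, not dplyr internals.
cols <- names(mtcars)

# Positions 3:4 among the columns left once the grouping column is removed
by_gear   <- setdiff(cols, "gear")[3:4]  # grouped by gear
by_cyl    <- setdiff(cols, "cyl")[3:4]   # grouped by cyl
ungrouped <- cols[3:4]                   # no grouping

print(by_gear)    # "disp" "hp"
print(by_cyl)     # "hp" "drat"
print(ungrouped)  # "disp" "hp"
```

Because cyl is the second column, removing it shifts every later column one position to the left, which is why 3:4 lands on hp and drat in that case.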
Is this the same on other platforms? Is it a bug or a feature?
With dplyr version 0.7.5, this behaviour can no longer be reproduced; both grouped calls now summarise disp and hp:
    library(dplyr)

    mtcars %>% group_by(gear) %>% summarise_at(3:4, mean)
    # # A tibble: 3 x 3
    #    gear  disp    hp
    #   <dbl> <dbl> <dbl>
    # 1     3  326. 176.
    # 2     4  123.  89.5
    # 3     5  202. 196.

    mtcars %>% group_by(cyl) %>% summarise_at(3:4, mean)
    # # A tibble: 3 x 3
    #     cyl  disp    hp
    #   <dbl> <dbl> <dbl>
    # 1     4  105.  82.6
    # 2     6  183. 122.
    # 3     8  353. 209.
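If the code has to run across dplyr versions, one way to avoid relying on positions at all is to select the columns by name. The following is a minimal sketch, assuming a dplyr version that provides vars() alongside summarise_at(); the result names by_gear and by_cyl are chosen only for this example.

```r
library(dplyr)

# Selecting by name with vars() does not depend on where the grouping column sits
by_gear <- mtcars %>%
  group_by(gear) %>%
  summarise_at(vars(disp, hp), mean)

by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise_at(vars(disp, hp), mean)

names(by_gear)  # "gear" "disp" "hp"
names(by_cyl)   # "cyl" "disp" "hp"
```

Both results summarise the same two columns, whichever side of them the grouping column is on.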