When I use group_by and summarise in dplyr, I can naturally apply different summary functions to different variables. For instance:
library(tidyverse)
df <- tribble(
~category, ~x, ~y, ~z,
#----------------------
'a', 4, 6, 8,
'a', 7, 3, 0,
'a', 7, 9, 0,
'b', 2, 8, 8,
'b', 5, 1, 8,
'b', 8, 0, 1,
'c', 2, 1, 1,
'c', 3, 8, 0,
'c', 1, 9, 1
)
df %>% group_by(category) %>% summarize(
x=mean(x),
y=median(y),
z=first(z)
)
results in output:
# A tibble: 3 x 4
  category     x     y     z
  <chr>    <dbl> <dbl> <dbl>
1 a            6     6     8
2 b            5     1     8
3 c            2     8     1
My question is, how would I do this with summarise_at? Obviously for this example it's unnecessary, but assume I have lots of variables that I want to take the mean of, lots of medians, etc.
Do I lose this functionality once I move to summarise_at? Do I have to use all functions on all groups of variables and then throw away the ones I don't want?
Perhaps I'm just missing something, but I can't figure it out, and I don't see any examples of this in the documentation. Any help is appreciated.
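Since your question is about summarise_at, here is my idea. Note that this applies every function to every listed variable, so you get one column per variable/function pair (x_mean, y_mean, z_mean, x_sd, and so on):
df %>% group_by(category) %>%
  summarise_at(vars(x, y, z),
               # list() replaces funs(), which is deprecated as of dplyr 0.8.0
               list(mean = mean, sd = sd, min = min),
               na.rm = TRUE)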
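If the goal is one function per group of variables rather than every function on every variable, one possible approach is to place several across() calls inside a single summarise(). This is only a sketch and assumes dplyr 1.0.0 or later, where across() is available; the column groupings are illustrative, and each c(...) could list many columns or use a selection helper such as starts_with().

```r
library(dplyr)

# Same data as in the question
df <- tibble::tribble(
  ~category, ~x, ~y, ~z,
  'a', 4, 6, 8,
  'a', 7, 3, 0,
  'a', 7, 9, 0,
  'b', 2, 8, 8,
  'b', 5, 1, 8,
  'b', 8, 0, 1,
  'c', 2, 1, 1,
  'c', 3, 8, 0,
  'c', 1, 9, 1
)

# One across() call per group of columns, each with its own function.
# Assumption: across() requires dplyr >= 1.0.0.
result <- df %>%
  group_by(category) %>%
  summarise(
    across(c(x), mean),    # columns that should be averaged
    across(c(y), median),  # columns that should get the median
    across(c(z), first)    # columns where the first value is kept
  )

print(result)
```

With a single unnamed function per across() call, the output columns keep their original names, so this should give the same x, y, and z columns as the explicit summarise() call in the question.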