
How to combine two different dplyr summaries in a single command

Tags: r, dplyr, summarize

I am trying to create a grouped summary that reports the number of records in each group and then also shows the means of a series of variables.

I can only work out how to do this as two separate summaries, which I then join together. This works fine, but is there a more elegant way to do it?

dailyn <- daily %>% # this summarises n
  group_by(type) %>%
  summarise(n = n())

dailymeans <- daily %>% # this summarises the means
  group_by(type) %>%
  summarise_at(vars(starts_with("d.")), funs(mean(., na.rm = TRUE)))

dailysummary <- inner_join(dailyn, dailymeans) # this joins the two parts together

The data I'm working with is a dataframe like this:

daily<-data.frame(type=c("A","A","B","C","C","C"),
                  d.happy=c(1,5,3,7,2,4),
                  d.sad=c(5,3,6,3,1,2))
mob asked May 29 '17 10:05


2 Answers

You can do this in one call: group the data, use mutate instead of summarize so that every row carries its group's count and means, and then use slice() to keep the first row of each type:

daily %>% group_by(type) %>% 
  mutate(n = n()) %>% 
  mutate_at(vars(starts_with("d.")),funs(mean(., na.rm = TRUE))) %>% 
  slice(1L)

Edit: It might be clearer how this works in this modified example, where the means are written to new columns instead of overwriting the originals:

daily_summary <- daily %>% group_by(type) %>% 
  mutate(n = n()) %>% 
  mutate_at(vars(starts_with("d.")),funs("mean" = mean(., na.rm = TRUE)))

daily_summary
# Source: local data frame [6 x 6]
# Groups: type [3]
# 
# # A tibble: 6 x 6
#    type d.happy d.sad     n d.happy_mean d.sad_mean
#  <fctr>   <dbl> <dbl> <int>        <dbl>      <dbl>
#1      A       1     5     2     3.000000          4
#2      A       5     3     2     3.000000          4
#3      B       3     6     1     3.000000          6
#4      C       7     3     3     4.333333          2
#5      C       2     1     3     4.333333          2
#6      C       4     2     3     4.333333          2

daily_summary %>% 
  slice(1L)

# Source: local data frame [3 x 6]
# Groups: type [3]
# 
# # A tibble: 3 x 6
#    type d.happy d.sad     n d.happy_mean d.sad_mean
#  <fctr>   <dbl> <dbl> <int>        <dbl>      <dbl>
#1      A       1     5     2     3.000000          4
#2      B       3     6     1     3.000000          6
#3      C       7     3     3     4.333333          2
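
For comparison, here is a sketch of a single-call version that needs neither mutate nor slice(). It assumes dplyr 1.0.0 or later, where across() is available; that function is not part of the code above. The name daily_across is made up for this illustration, and the data frame is the one from the question.

```r
library(dplyr)

# Example data from the question
daily <- data.frame(type = c("A", "A", "B", "C", "C", "C"),
                    d.happy = c(1, 5, 3, 7, 2, 4),
                    d.sad = c(5, 3, 6, 3, 1, 2))

# Assumes dplyr >= 1.0.0: one summarise() call computes the group size
# and the mean of every column whose name starts with "d."
daily_across <- daily %>%
  group_by(type) %>%
  summarise(n = n(),
            across(starts_with("d."), ~ mean(.x, na.rm = TRUE)))

daily_across
```

Unlike the modified example, the result here holds only type, n and the averaged columns, so no first-row values of d.happy and d.sad are left over.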
emiltb answered Sep 28 '22 00:09


Similar to this question, you can try:

daily %>% 
  group_by(type) %>% 
  mutate(n = n()) %>% 
  mutate_at(vars(starts_with("d.")),funs(mean(., na.rm = TRUE))) %>%
  unique

which gives:

Source: local data frame [3 x 4]
Groups: type [3]

    type  d.happy d.sad     n
  <fctr>    <dbl> <dbl> <int>
1      A 3.000000     4     2
2      B 3.000000     6     1
3      C 4.333333     2     3
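
The unique step works because, after mutate, all rows within a type hold identical values, so dropping duplicates leaves one row per type. Below is a sketch of the same idea assuming dplyr 1.0.0 or later: across() stands in for mutate_at() with funs(), and distinct() is used as the dplyr counterpart of unique. The name daily_distinct is hypothetical.

```r
library(dplyr)

# Example data from the question
daily <- data.frame(type = c("A", "A", "B", "C", "C", "C"),
                    d.happy = c(1, 5, 3, 7, 2, 4),
                    d.sad = c(5, 3, 6, 3, 1, 2))

# Assumes dplyr >= 1.0.0: overwrite each "d." column with its group mean,
# add the group size, then keep one copy of each (now duplicated) row
daily_distinct <- daily %>%
  group_by(type) %>%
  mutate(n = n(),
         across(starts_with("d."), ~ mean(.x, na.rm = TRUE))) %>%
  distinct() %>%
  ungroup()

daily_distinct
```

The trailing ungroup() is optional; it just returns a plain tibble so that later steps are not applied per type by accident.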
Aramis7d answered Sep 28 '22 02:09