Summarize data at different aggregate levels - R and tidyverse

I'm creating a bunch of basic status reports, and one of the things I'm finding tedious is adding a total row to all my tables. I'm currently using the tidyverse approach; an example of my current code is below. What I'm looking for is an option to include a few different levels of summarization by default.

library(dplyr)

#load into RStudio viewer (not required)
iris = iris

#summary at the group level
summary_grouped = iris %>% 
       group_by(Species) %>%
       summarize(mean_s_length = mean(Sepal.Length),
                 max_s_width = max(Sepal.Width))

#summary at the overall level
summary_overall = iris %>% 
  summarize(mean_s_length = mean(Sepal.Length),
            max_s_width = max(Sepal.Width)) %>%
  mutate(Species = "Overall")

#append results for report       
summary_table = rbind(summary_grouped, summary_overall)

Doing this multiple times over is very tedious. I kind of want something like this (hypothetical syntax, not an existing group_by() option):

summary_overall = iris %>% 
       group_by(Species, total = TRUE) %>%
       summarize(mean_s_length = mean(Sepal.Length),
                 max_s_width = max(Sepal.Width))

FYI - if you're familiar with SAS, I'm looking for the same type of functionality available via the CLASS, WAYS, or TYPES statements in PROC MEANS, which let me control the level of summarization and get multiple levels in one call.

Any help is appreciated. I know I can create my own function, but I was hoping something already exists. I would also prefer to stick with the tidyverse style of programming, though I'm not set on that.

Reeza asked Jun 21 '19 19:06


1 Answer

Another alternative:

library(tidyverse)  

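iris %>% 
  # convert the factor to character so "Overall" can be filled in later
  mutate_at("Species", as.character) %>%
  # build a list of two data frames: one grouped by Species, one ungrouped
  list(group_by(.,Species), .) %>%
  # apply the same summary to both
  map(~summarize(.,mean_s_length = mean(Sepal.Length),
                 max_s_width = max(Sepal.Width))) %>%
  # stack the results; the ungrouped row has no Species, so it becomes NA
  bind_rows() %>%
  replace_na(list(Species="Overall"))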
#> # A tibble: 4 x 3
#>   Species    mean_s_length max_s_width
#>   <chr>              <dbl>       <dbl>
#> 1 setosa              5.01         4.4
#> 2 versicolor          5.94         3.4
#> 3 virginica           6.59         3.8
#> 4 Overall             5.84         4.4
Moody_Mudskipper answered Sep 19 '22 13:09
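
For repeated use, the group-plus-overall pattern discussed above could be wrapped in a small helper. This is only a sketch: summarize_with_total() and its total_label argument are hypothetical names, not part of dplyr, and it assumes dplyr 1.0 or later (for the .groups argument and the "{{ }}" := naming syntax). It handles a single grouping column only.

```r
library(dplyr)

# Hypothetical helper (not a dplyr function): summarize by one grouping
# column, then append a row computed on the ungrouped data.
summarize_with_total <- function(data, group_col, ..., total_label = "Overall") {
  # Convert the grouping column to character so the label can be stored in it
  data <- mutate(data, "{{ group_col }}" := as.character({{ group_col }}))

  # Summary at the group level
  grouped <- data %>%
    group_by({{ group_col }}) %>%
    summarize(..., .groups = "drop")

  # Same summary on the whole data set, labelled in the grouping column
  overall <- data %>%
    summarize(...) %>%
    mutate("{{ group_col }}" := total_label)

  # bind_rows() matches columns by name, so column order does not matter
  bind_rows(grouped, overall)
}

summary_table <- summarize_with_total(
  iris, Species,
  mean_s_length = mean(Sepal.Length),
  max_s_width = max(Sepal.Width)
)
```

The summary expressions are passed through ... so they are written once and evaluated at both levels; total_label has to be given by name because it comes after the dots. Extending this to several grouping levels (closer to the WAYS or TYPES behaviour in PROC MEANS) would need a loop over grouping sets, which this sketch does not attempt.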