
Efficient assignment of a function with multiple outputs in dplyr mutate or summarise

Tags:

r

dplyr

I've noticed a lot of examples here that use dplyr::mutate or dplyr::summarise in combination with a function returning multiple outputs to create multiple columns. For example:

tmp <- mtcars %>%
    group_by(cyl) %>%
    summarise(min = summary(mpg)[1],
              median = summary(mpg)[3],
              mean = summary(mpg)[4],
              max = summary(mpg)[6])

However, this syntax calls the summary function four times per group in this example, which does not seem particularly efficient. What ways are there to efficiently assign a list output to a list of column names in summarise or mutate?

For example, from a previous question (Split a data frame column containing a list into multiple columns using dplyr (or otherwise)), I know that you can assign the output of summary as a list and then split it using do(data.frame(...)). However, this means that you then have to add the column names later, and the syntax is not as pretty.
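
A minimal sketch of one way to get the four named columns while calling summary() only once per group. It assumes dplyr 1.0.0 or later, where summarise() unpacks an unnamed data-frame result into separate columns. The helper name summary_stats is hypothetical and not part of any package.

```r
library(dplyr)

# Hypothetical helper: call summary() once and return a one-row tibble
# whose column names become the output column names.
summary_stats <- function(x) {
  s <- summary(x)
  tibble(
    min    = s[["Min."]],
    median = s[["Median"]],
    mean   = s[["Mean"]],
    max    = s[["Max."]]
  )
}

# Assumes dplyr >= 1.0.0: an unnamed data-frame expression inside
# summarise() is spread into one column per tibble column.
result <- mtcars %>%
  group_by(cyl) %>%
  summarise(summary_stats(mpg))

result
```

The column names are set once inside the helper, so nothing has to be renamed afterwards.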

Alex asked Jul 06 '16 11:07


1 Answer

This can also be accomplished using tidyr::nest and purrr::map. Note that the output returned by summary needs to be converted from a named vector to a data.frame or tibble. I'm using dplyr::bind_rows below to accomplish this, but data.frame(as.list(summary(.$mpg))) could equally be used instead.


suppressWarnings(library(tidyverse))

mtcars %>%
  group_by(cyl) %>%
  nest() %>% 
  summarise(stats = map(data, ~ bind_rows(summary(.$mpg)))) %>% 
  unnest(stats)
#> # A tibble: 3 x 7
#>     cyl Min.    `1st Qu.` Median  Mean     `3rd Qu.` Max.   
#>   <dbl> <table> <table>   <table> <table>  <table>   <table>
#> 1     4 21.4    22.80     26.0    26.66364 30.40     33.9   
#> 2     6 17.8    18.65     19.7    19.74286 21.00     21.4   
#> 3     8 10.4    14.40     15.2    15.10000 16.25     19.2

Created on 2021-04-19 by the reprex package (v0.3.0)
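
For comparison, the data.frame(as.list(...)) variant mentioned above could be sketched as follows. This is an untested-against-the-original-setup illustration rather than the answer's own code; check.names = FALSE is an added assumption to keep names such as `1st Qu.` from being altered by data.frame.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Same pipeline as above, but each summary() result is converted to a
# one-row data.frame via as.list() instead of bind_rows().
alt <- mtcars %>%
  group_by(cyl) %>%
  nest() %>%
  summarise(
    stats = map(data, ~ data.frame(as.list(summary(.$mpg)), check.names = FALSE))
  ) %>%
  unnest(stats)

alt
```

Either conversion calls summary() once per group; the choice mainly affects how the column names and types come out.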

JWilliman answered Sep 19 '22 17:09