Split a data frame column containing a list into multiple columns using dplyr (or otherwise)

Tags: r, dplyr

Consider the following example data

library(dplyr)
tmp <- mtcars %>%
    group_by(cyl) %>%
    summarise(mpg_sum = list(summary(mpg)))

such that mpg_sum contains the min, 1st quartile, median, mean, 3rd quartile, and max of the mpg variable by groups in cyl.

How do I unpack this column into 6 columns with appropriate column names with dplyr, or otherwise?

872

asked Jul 04 '16 06:07

Alex

2 Answers

We can use data.table. Convert the 'data.frame' to a 'data.table' with as.data.table(mtcars); then, grouped by 'cyl', take the summary of 'mpg' and convert it to a list so that each statistic becomes its own column.

library(data.table)
as.data.table(mtcars)[, as.list(summary(mpg)), by = cyl]
#    cyl Min. 1st Qu. Median  Mean 3rd Qu. Max.
#1:   6 17.8   18.65   19.7 19.74   21.00 21.4
#2:   4 21.4   22.80   26.0 26.66   30.40 33.9
#3:   8 10.4   14.40   15.2 15.10   16.25 19.2

Or, using only dplyr: after grouping by 'cyl', use do() to apply the same operation to each group.

library(dplyr)
mtcars %>%
     group_by(cyl) %>%
     do(data.frame(as.list(summary(.$mpg)), check.names=FALSE) )
#   cyl  Min. 1st Qu. Median  Mean 3rd Qu.  Max.
#  <dbl> <dbl>   <dbl>  <dbl> <dbl>   <dbl> <dbl>
#1     4  21.4   22.80   26.0 26.66   30.40  33.9
#2     6  17.8   18.65   19.7 19.74   21.00  21.4
#3     8  10.4   14.40   15.2 15.10   16.25  19.2
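Since dplyr 1.0.0, do() is superseded. The following is a minimal sketch of the same per-group summary without do(), assuming dplyr >= 1.0.0, where summarise() unpacks an unnamed data-frame result into separate columns. The name res is used only for illustration.

```r
library(dplyr)

# Assumes dplyr >= 1.0.0: an unnamed data frame returned inside
# summarise() is spread into one column per statistic.
res <- mtcars %>%
  group_by(cyl) %>%
  summarise(data.frame(as.list(summary(mpg)), check.names = FALSE))

# check.names = FALSE keeps the non-syntactic names such as "1st Qu."
names(res)
```

Because the column names contain spaces and dots, refer to them with backticks afterwards, for example res$`1st Qu.`.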

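Or using purrr (slice_rows, by_slice and dmap have since moved from purrr to the purrrlyr package)

library(dplyr)     # for %>% and select()
library(purrrlyr)  # slice_rows(), by_slice(), dmap()
mtcars %>%
     slice_rows("cyl") %>%
     select(mpg) %>%
     by_slice(dmap, summary, .collate = "cols")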

173

answered Oct 10 '22 19:10

akrun

As noted in the comments, you can also use the tidy function from the broom package:

library(broom)
mtcars %>% group_by(cyl) %>% do(tidy(summary(.$mpg)))
# Source: local data frame [3 x 7]
# Groups: cyl [3]
# 
#     cyl minimum    q1 median  mean    q3 maximum
#   (dbl)   (dbl) (dbl)  (dbl) (dbl) (dbl)   (dbl)
# 1     4    21.4 22.80   26.0 26.66 30.40    33.9
# 2     6    17.8 18.65   19.7 19.74 21.00    21.4
# 3     8    10.4 14.40   15.2 15.10 16.25    19.2
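The answers above recompute the summaries from mtcars. If the list column from the question already exists, it can also be unpacked directly. The following is a minimal base R sketch under that assumption; the names stats and unpacked are hypothetical and used only for illustration.

```r
library(dplyr)

# Rebuild the list column from the question.
tmp <- mtcars %>%
  group_by(cyl) %>%
  summarise(mpg_sum = list(summary(mpg)))

# unclass() drops the "summaryDefault"/"table" classes, leaving plain
# named numeric vectors; rbind() stacks them into a 3 x 6 matrix whose
# column names come from the vector names.
stats <- do.call(rbind, lapply(tmp$mpg_sum, unclass))

# Bind the group key back on; check.names = FALSE preserves "1st Qu." etc.
unpacked <- data.frame(cyl = tmp$cyl, stats, check.names = FALSE)
unpacked
```

This relies on every element of mpg_sum having the same six names, which holds here because mpg has no missing values; with NAs, summary() adds an extra "NA's" entry for the affected groups and the vectors would no longer line up.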

answered Oct 10 '22 19:10

talat
