tidyverse summarize multiple columns but show result as rows

Tags: r, dplyr, summarize, tidyr

I have data for which I want to compute several summary statistics across multiple columns with the tidyverse approach. However, dplyr's summarize function creates a new column for every column/statistic combination, whereas I would prefer to see the original column names as rows and each statistic as its own column. So my question is:

Is there a more elegant (and I know "elegant" is a vague term) way to achieve this than by accompanying the summarize function with a pivot_longer and pivot_wider?

I'm using the latest dev versions of the tidyverse packages, i.e. dplyr 0.8.99.9003 and tidyr 1.1.0, so it's fine if a solution requires new functions from these packages that are not yet on CRAN.

library(tidyverse)

dat <- as.data.frame(matrix(1:100, ncol = 5))

dat %>%
  summarize(across(everything(), list(mean = mean,
                                      sum  = sum))) %>%
  pivot_longer(cols      = everything(),
               names_sep = "_",
               names_to  = c("variable", "statistic")) %>%
  pivot_wider(names_from = "statistic")

Expected outcome:

# A tibble: 5 x 3
  variable  mean   sum
  <chr>    <dbl> <dbl>
1 V1        10.5   210
2 V2        30.5   610
3 V3        50.5  1010
4 V4        70.5  1410
5 V5        90.5  1810

Note: I'm not set on the name of any of the columns, so if there's a nice way to get the structure of the table with different/generic names, that'd also be fine.


asked May 27 '20 11:05

deschen

2 Answers

You can skip the pivot_wider step by using ".value" in names_to.

library(dplyr)

dat %>%
  summarise_all(list(mean = mean, sum = sum)) %>%
  tidyr::pivot_longer(cols      = everything(),
                      names_sep = "_",
                      names_to  = c("variable", ".value"))


# A tibble: 5 x 3
#  variable  mean   sum
#  <chr>    <dbl> <int>
#1 V1        10.5   210
#2 V2        30.5   610
#3 V3        50.5  1010
#4 V4        70.5  1410
#5 V5        90.5  1810
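One caveat with names_sep = "_": it assumes the original column names contain no underscores. A minimal sketch of a workaround, assuming dplyr >= 1.0.0 (for across()) and using a hypothetical data frame dat2 whose column names are made up for illustration: names_pattern with a greedy first group splits each name at its last underscore only.

```r
library(dplyr)
library(tidyr)

# Hypothetical data (not from the question): column names contain underscores
dat2 <- data.frame(height_cm = c(150, 160, 170),
                   weight_kg = c(50, 60, 70))

res <- dat2 %>%
  # across() names the results "<column>_<statistic>" by default
  summarise(across(everything(), list(mean = mean, sum = sum))) %>%
  # "(.*)_(.*)": the greedy first group captures everything up to the LAST
  # underscore (the variable name); the second group is the statistic,
  # which ".value" turns into its own output column
  pivot_longer(cols          = everything(),
               names_to      = c("variable", ".value"),
               names_pattern = "(.*)_(.*)")

res
```

This should return one row per original column (height_cm, weight_kg) with mean and sum columns. It still assumes the statistic names themselves (mean, sum) contain no underscore.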


answered Sep 28 '22 08:09

Ronak Shah

Not a tidyverse solution, but a data.table one instead. Also, I'm not sure it is any more 'elegant' ;-)

But here you go:

library(data.table)
# make 'dat' a data.table (by reference)
setDT(dat)
# transpose, keeping the column names in a new column 'var_name'
dat <- transpose(dat, keep.names = "var_name")
# melt to long format and summarise by original column name
melt(dat, id.vars = "var_name")[, .(mean = mean(value), sum = sum(value)), by = var_name]


#    var_name mean  sum
# 1:       V1 10.5  210
# 2:       V2 30.5  610
# 3:       V3 50.5 1010
# 4:       V4 70.5 1410
# 5:       V5 90.5 1810
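As a possible variation (a sketch, not part of the original answer), the transpose step can probably be skipped by melting all columns directly. The name long is illustrative only, and as.data.table() is used here so that the original dat is not modified.

```r
library(data.table)

dat <- as.data.frame(matrix(1:100, ncol = 5))

# as.data.table() returns a copy, so 'dat' itself stays a plain data.frame
long <- melt(as.data.table(dat),
             measure.vars  = names(dat),   # treat every column as a measure
             variable.name = "var_name")   # column holding the old column names

# one row per original column, one column per statistic
res <- long[, .(mean = mean(value), sum = sum(value)), by = var_name]
res
```

Note that var_name comes out as a factor here rather than a character column; wrap it in as.character() if that matters.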

answered Sep 28 '22 08:09

Wimpel
