dplyr - Multiple summary functions

Tags: r, dplyr

I am trying to calculate multiple summary statistics for every column of a data frame.

I tried dplyr's summarise_each. However, the results come back as a flat, single-row data frame, with each function's name appended to the column name as a suffix.

Is there a direct way, using dplyr or base R, to get the results as a data frame whose columns are the original data frame's columns and whose rows are the summary functions?

library(dplyr)

df <- data.frame(A = sample(1:100, 20),
                 B = sample(110:200, 20),
                 C = sample(c(0, 1), 20, replace = TRUE))

df %>% summarise_each(funs(min, max)) 
#   A_min B_min C_min A_max B_max C_max
# 1    13   117     0    98   188     1

# Desired format
summary(df)
# A               B               C       
# Min.   :13.00   Min.   :117.0   Min.   :0.00  
# 1st Qu.:34.75   1st Qu.:134.2   1st Qu.:0.00  
# Median :45.00   Median :148.0   Median :1.00  
# Mean   :52.35   Mean   :149.9   Mean   :0.65  
# 3rd Qu.:62.50   3rd Qu.:168.8   3rd Qu.:1.00  
# Max.   :98.00   Max.   :188.0   Max.   :1.00  
Deena asked Nov 02 '16 08:11


1 Answer

How about reshaping to long format with tidyr first, then summarising per column:

library(tidyr)
gather(df) %>% group_by(key) %>% summarise_all(funs(min, max))
# A tibble: 3 × 3
#     key   min   max
#   <chr> <dbl> <dbl>
# 1     A     2    92
# 2     B   111   194
# 3     C     0     1
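
Note that the result above has one row per column and one column per function, which is the transpose of the layout asked for. A minimal sketch of one way to flip it, assuming a recent tidyr (1.0.0 or later) where pivot_longer and pivot_wider are available; the seed and the names by_column, by_function, and stat are illustrative choices, not part of the original answer:

```r
library(dplyr)
library(tidyr)

set.seed(1)  # assumption: fixed seed so the example is reproducible
df <- data.frame(A = sample(1:100, 20),
                 B = sample(110:200, 20),
                 C = sample(c(0, 1), 20, replace = TRUE))

# Long format, then one row per original column with min and max
by_column <- df %>%
  pivot_longer(everything(), names_to = "key", values_to = "value") %>%
  group_by(key) %>%
  summarise(min = min(value), max = max(value))

# Flip it: one row per summary function, one column per original column
by_function <- by_column %>%
  pivot_longer(-key, names_to = "stat", values_to = "value") %>%
  pivot_wider(names_from = key, values_from = value)

by_function
```

Here by_function should have a stat column followed by A, B, and C, with one row for min and one for max. Further functions can be added in the summarise call.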
Axeman answered Sep 22 '22 05:09
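
Since the question also asks about base R, here is a rough sketch of one possible approach using sapply, which applies a function to each column and binds the named results into a matrix. The helper name stats, the seed, and the choice of min, mean, and max are assumptions made for illustration:

```r
set.seed(1)  # assumption: fixed seed so the example is reproducible
df <- data.frame(A = sample(1:100, 20),
                 B = sample(110:200, 20),
                 C = sample(c(0, 1), 20, replace = TRUE))

# Hypothetical helper: returns a named vector of summary statistics
stats <- function(x) c(min = min(x), mean = mean(x), max = max(x))

# sapply gives a matrix with one row per statistic and one column per
# data frame column; as.data.frame keeps the row names
result <- as.data.frame(sapply(df, stats))

result
```

The rows are named after the statistics and the columns keep the names A, B, and C, so a single value can be looked up with, for example, result["max", "B"].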