I am trying to calculate multiple stats for a data frame. I tried dplyr's summarise_each. However, the results are returned as a flat, single-row data frame with the function's name added as a suffix to each column name.
Is there a direct way, using dplyr or base R, to get the results in a data frame with the original data frame's columns as the columns and the summary functions as the rows?
library(dplyr)
df = data.frame(A = sample(1:100, 20),
                B = sample(110:200, 20),
                C = sample(c(0, 1), 20, replace = TRUE))
df %>% summarise_each(funs(min, max))
# A_min B_min C_min A_max B_max C_max
# 1 13 117 0 98 188 1
# Desired format
summary(df)
# A B C
# Min. :13.00 Min. :117.0 Min. :0.00
# 1st Qu.:34.75 1st Qu.:134.2 1st Qu.:0.00
# Median :45.00 Median :148.0 Median :1.00
# Mean :52.35 Mean :149.9 Mean :0.65
# 3rd Qu.:62.50 3rd Qu.:168.8 3rd Qu.:1.00
# Max. :98.00 Max. :188.0 Max. :1.00
How about:
library(tidyr)
gather(df) %>% group_by(key) %>% summarise_all(funs(min, max))
# A tibble: 3 × 3
#   key     min   max
#   <chr> <dbl> <dbl>
# 1 A         2    92
# 2 B       111   194
# 3 C         0     1
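Note that this gives one row per column and one column per function, which is the transpose of the layout asked for. If you want the functions as rows, a base R option might be sapply. The following is only a minimal sketch: the set.seed call and the name stats are my own additions for illustration, not part of the question.

```r
# Example data shaped like the question's df (the seed is an assumption,
# added only to make the run reproducible)
set.seed(1)
df <- data.frame(A = sample(1:100, 20),
                 B = sample(110:200, 20),
                 C = sample(c(0, 1), 20, replace = TRUE))

# sapply() calls the function once per column. Each call returns a named
# vector of length 2, so the results are simplified into a matrix with
# one row per statistic (min, max) and one column per df column (A, B, C).
stats <- as.data.frame(sapply(df, function(x) c(min = min(x), max = max(x))))
stats
```

The row names of stats are the function names and the column names match df, so more statistics can be added by extending the named vector inside the anonymous function. Also keep in mind that funs() and summarise_each() are deprecated in recent dplyr releases, so the dplyr code above may emit warnings there.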