I want to add summary statistics to a histogram plot made with ggplot2. I am using the following code:
#Loading the required packages
library(dplyr)
library(ggplot2)
library(reshape2)
library(moments)
library(ggpmisc)
#Loading the data
df <- iris
df.m <- melt(df, id="Species")
#Calculating the summary statistics
summ <- df.m %>%
  group_by(variable) %>%
  summarize(min = min(value), max = max(value),
            mean = mean(value), q1 = quantile(value, probs = 0.25),
            median = median(value), q3 = quantile(value, probs = 0.75),
            sd = sd(value), skewness = skewness(value), kurtosis = kurtosis(value))
#Histogram plotting
p1 <- ggplot(df.m) +
  geom_histogram(aes(x = value), fill = "grey", color = "black") +
  facet_wrap(~variable, scales = "free", ncol = 2) +
  theme_bw()
p1 + geom_table_npc(data = summ, label = list(summ),
                    npcx = 0.00, npcy = 1, hjust = 0, vjust = 1)
It gives me the following plot:
Every facet shows the summary statistics of all the variables. I want each facet to show only the summary statistics of its own variable. How can I do that?
R provides a wide range of functions for obtaining summary statistics. One way to obtain descriptive statistics is to call sapply() with a chosen summary function. Functions commonly used with sapply() include mean, sd, var, min, max, median, range, and quantile.
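As a rough illustration of the sapply() approach, the sketch below applies a few of those functions to the numeric columns of the built-in iris data. The object names (num, col_means, col_sds, col_quartiles) are made up for this example.

```r
# Numeric columns of the built-in iris data (drop the Species factor)
num <- iris[, 1:4]

# One statistic per column; sapply() returns a named numeric vector
col_means <- sapply(num, mean)
col_sds <- sapply(num, sd)

# Extra arguments are passed on to the function, here to quantile();
# the result is a matrix with one row per requested probability
col_quartiles <- sapply(num, quantile, probs = c(0.25, 0.5, 0.75))

print(round(col_means, 2))
print(round(col_sds, 2))
print(round(col_quartiles, 2))
```

Each call returns one value (or one column of values) per variable, which makes sapply() handy for a quick overview before building a tidier summary table.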
You can also make histograms with ggplot2, “a plotting system for R, based on the grammar of graphics” created by Hadley Wickham. This post focuses on making a histogram with ggplot2.
A histogram and a combined dot, box, mean, percentile and SD plot give a visual summary of a sample, while statistics such as the mean, standard deviation, skewness, kurtosis, median and percentiles summarise it numerically.
Basic histogram with geom_histogram()
Building a histogram with ggplot2 is relatively straightforward thanks to the geom_histogram() function. Only one numeric variable is needed as input.
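A minimal sketch of such a histogram, assuming the built-in iris data and an arbitrary choice of 20 bins (neither comes from the text above):

```r
library(ggplot2)

# Histogram of a single numeric column.
# bins is set explicitly; 20 is an arbitrary choice for this example.
p <- ggplot(iris, aes(x = Sepal.Length)) +
  geom_histogram(bins = 20, fill = "grey", color = "black") +
  theme_bw()

print(p)
```

Setting bins (or binwidth) explicitly also avoids the message ggplot2 prints when it falls back to its default of 30 bins.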
You need to split the summary data frame so that each facet gets its own table:
p1 + geom_table_npc(data = summ, label = split(summ, summ$variable),
                    npcx = 0.00, npcy = 1, hjust = 0, vjust = 1, size = 2)
Or nest the summary table you have (nest() comes from tidyr):
library(tidyr)
summ <- summ %>% nest(data = -c(variable))
summ
# A tibble: 4 x 2
variable data
<fct> <list<df[,9]>>
1 Sepal.Length [1 × 9]
2 Sepal.Width [1 × 9]
3 Petal.Length [1 × 9]
4 Petal.Width [1 × 9]
p1 + geom_table_npc(data = summ, label = summ$data,
                    npcx = 0.00, npcy = 1, hjust = 0, vjust = 1)
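To see why splitting helps, it may be useful to compare the two list shapes directly. In the question, list(summ) is a list of length one that holds the whole table, so presumably that same table is reused in every facet, whereas split() yields one single-row table per variable. The sketch below is base R only; summ_small is a hypothetical, reduced stand-in for the summary table, not the dplyr result from the question.

```r
# Small stand-in for the summary table: one row per measured variable
vars <- names(iris)[1:4]
summ_small <- data.frame(
  variable = factor(vars, levels = vars),
  mean = sapply(iris[vars], mean),
  sd = sapply(iris[vars], sd),
  row.names = NULL
)

# list(summ_small): a single element containing every row
whole <- list(summ_small)
print(length(whole))      # 1
print(nrow(whole[[1]]))   # 4

# split(): one element per variable, each a one-row data frame
pieces <- split(summ_small, summ_small$variable)
print(length(pieces))     # 4
print(sapply(pieces, nrow))
```

With four list elements and four rows in the layer data, each facet's row can be paired with its own table, which is the behaviour the question asks for.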