Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Boxplot show the value of mean

Tags:

r

ggplot2

People also ask

Can a Boxplot show the mean?

Box plots are useful as they provide a visual summary of the data enabling researchers to quickly identify mean values, the dispersion of the data set, and signs of skewness.

How do you show mean in a Boxplot in R?

R. Output: In order to show mean values in boxplot using ggplot2, we use the stat_summary() function to compute new summary statistics and add them to the plot. We use stat_summary() function with ggplot() function.

Does Boxplot show mean and standard deviation?

In addition to showing median, first and third quartile and maximum and minimum values, the Box and Whisker chart is also used to depict Mean, Standard Deviation, Mean Deviation and Quartile Deviation.


First, you can calculate the group means with aggregate:

means <- aggregate(weight ~  group, PlantGrowth, mean)

This dataset can be used with geom_text:

library(ggplot2)
ggplot(data=PlantGrowth, aes(x=group, y=weight, fill=group)) + geom_boxplot() +
  stat_summary(fun=mean, colour="darkred", geom="point", 
               shape=18, size=3, show.legend=FALSE) + 
  geom_text(data = means, aes(label = weight, y = weight + 0.08))

Here, + 0.08 is used to place the label above the point representing the mean.

enter image description here


An alternative version without ggplot2:

means <- aggregate(weight ~  group, PlantGrowth, mean)

boxplot(weight ~ group, PlantGrowth)
points(1:3, means$weight, col = "red")
text(1:3, means$weight + 0.08, labels = means$weight)

enter image description here


You can use the output value from stat_summary()

ggplot(data=PlantGrowth, aes(x=group, y=weight, fill=group)) 
+ geom_boxplot() 
+ stat_summary(fun.y=mean, colour="darkred", geom="point", hape=18, size=3,show_guide = FALSE)
+ stat_summary(fun.y=mean, colour="red", geom="text", show_guide = FALSE, 
               vjust=-0.7, aes( label=round(..y.., digits=1)))

You can also use a function within stat_summary to calculate the mean and the hjust argument to place the text, you need a additional function but no additional data frame:

fun_mean <- function(x){
  return(data.frame(y=mean(x),label=mean(x,na.rm=T)))}


ggplot(PlantGrowth,aes(x=group,y=weight)) +
geom_boxplot(aes(fill=group)) +
stat_summary(fun.y = mean, geom="point",colour="darkred", size=3) +
stat_summary(fun.data = fun_mean, geom="text", vjust=-0.7)

enter image description here


The Magrittr way

I know there is an accepted answer already, but I wanted to show one cool way to do it in single command with the help of magrittr package.

PlantGrowth %$% # open dataset and make colnames accessible with '$'
split(weight,group) %T>% # split by group and side-pipe it into boxplot
boxplot %>% # plot
lapply(mean) %>% # data from split can still be used thanks to side-pipe '%T>%'
unlist %T>% # convert to atomic and side-pipe it to points
points(pch=18)  %>% # add points for means to the boxplot
text(x=.+0.06,labels=.) # use the values to print text

This code will produce a boxplot with means printed as points and values: boxplot with means

I split the command on multiple lines so I can comment on what each part does, but it can also be entered as a oneliner. You can learn more about this in my gist.