I would like to create a box plot for grouped data that shows the mean of each group as a point inside its box. Using the following code, I only get a single point per x category instead of one point for each of the two groups.
library(ggplot2)
df <- data.frame(a = factor(rbinom(100, 1, 0.45), labels = c("m", "w")),
                 b = factor(rbinom(100, 1, 0.3), labels = c("young", "old")),
                 c = rnorm(100))
ggplot(aes(y = c, x = b, fill = a), data = df) +
  geom_boxplot() +
  stat_summary(fun = mean, geom = "point", shape = 21, size = 5, fill = "white")
To create a box plot by group in base R, you can pass a formula of the form y ~ x to the boxplot function, where y is a numerical variable and x is a categorical variable. You can supply the variables either by accessing the columns with the dollar sign or by passing the data frame through the data argument.
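As a rough sketch of the formula interface described above, the following uses simulated data shaped like the question's df (the seed and the reduced two-column data frame are assumptions made only for illustration). It also overlays the group means with points(), which is one possible base R way to mark them, not something the original text prescribes.

```r
# Simulated data in the same shape as the question's df (illustrative assumption)
set.seed(1)
df <- data.frame(
  b = factor(rbinom(100, 1, 0.3), labels = c("young", "old")),
  c = rnorm(100)
)

# Formula interface: numeric response on the left, grouping factor on the right
boxplot(c ~ b, data = df)

# Same plot, reaching the columns with the dollar sign instead of `data`
boxplot(df$c ~ df$b)

# One mean per level of b, drawn as white-filled points at box positions 1, 2, ...
group_means <- tapply(df$c, df$b, mean)
points(seq_along(group_means), group_means, pch = 21, bg = "white")
```

The boxes are drawn at integer x positions in factor-level order, which is why seq_along(group_means) lines the points up with them.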
A boxplot summarizes the distribution of a continuous variable and notably displays the median of each group.
To show mean values in a box plot with ggplot2, add a stat_summary() layer to the ggplot() call. It computes a summary statistic (here the mean) for each group and draws it on top of the boxes.
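For the simple case with a single grouping variable, a minimal sketch could look like this. The data frame is simulated in the same way as in the question (the seed is an assumption added for reproducibility), and fun is the current name of the argument that older ggplot2 versions called fun.y.

```r
library(ggplot2)

# Simulated data, one grouping factor only (illustrative assumption)
set.seed(1)
df <- data.frame(
  b = factor(rbinom(100, 1, 0.3), labels = c("young", "old")),
  c = rnorm(100)
)

# One box per level of b, plus one white-filled point at each group's mean
p <- ggplot(df, aes(x = b, y = c)) +
  geom_boxplot() +
  stat_summary(fun = mean, geom = "point", shape = 21, size = 5, fill = "white")
print(p)
```

Because no aesthetic splits the data further, setting fill to a constant here does no harm. The problem in the question only appears once fill is also what defines the groups.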
If you need a separate box plot for each column of your R data frame, you can use the lapply function to iterate over the columns. In this case, we divide the plotting area with par into one row and as many columns as the dataset has, but you could also draw each plot on its own.
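A sketch of that per-column approach, assuming the built-in trees dataset as example data (the original text does not say which dataset it has in mind):

```r
# Built-in dataset with three numeric columns, used here only as an example
data(trees)

# Split the device into 1 row and ncol(trees) columns, keeping the old settings
old_par <- par(mfrow = c(1, ncol(trees)))

# Draw one box plot per column; boxplot() returns its statistics, collected in a list
stats <- lapply(names(trees), function(col) {
  boxplot(trees[[col]], main = col)
})

# Restore the previous layout so later plots are not affected
par(old_par)
```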
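Part of the problem is that you set the fill of the point to a constant. Since fill is the aesthetic that tells ggplot2 to draw two box plots of different colors, overriding it makes the point behave as if there were only one group again. The points also need position_dodge() so that they sit over their boxes instead of on top of each other. I think this should give you what you want.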
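library(ggplot2)
# fill stays mapped to a, so the mean points are grouped and dodged like the boxes
ggplot(df, aes(x = b, y = c, fill = a)) +
  geom_boxplot() +
  # fun replaces fun.y, which is deprecated since ggplot2 3.3.0
  stat_summary(fun = mean, geom = "point", size = 5,
               position = position_dodge(width = 0.75), color = "white")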
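If you would rather keep the white-filled shape 21 points from the question, one possible variant is to map the group aesthetic explicitly in the summary layer, so the grouping no longer depends on fill. This is a sketch and not part of the answer above. The data frame is recreated with a seed (an assumption) so that the block runs on its own.

```r
library(ggplot2)

# Same simulated data as in the question, with a seed added for reproducibility
set.seed(1)
df <- data.frame(
  a = factor(rbinom(100, 1, 0.45), labels = c("m", "w")),
  b = factor(rbinom(100, 1, 0.3), labels = c("young", "old")),
  c = rnorm(100)
)

p <- ggplot(df, aes(x = b, y = c, fill = a)) +
  geom_boxplot() +
  # group = a keeps one mean per box even though fill is overridden with a constant
  stat_summary(aes(group = a), fun = mean, geom = "point",
               shape = 21, size = 5, fill = "white",
               position = position_dodge(width = 0.75))
print(p)
```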