Below is some code that produces a boxplot with ggplot2. I'm trying to modify it to suit my problem:
library(ggplot2)
set.seed(1)
# create fictitious data
a <- rnorm(10)
b <- rnorm(12)
c <- rnorm(7)
d <- rnorm(15)
# data groups
group <- factor(rep(1:4, c(10, 12, 7, 15)))
# dataframe
mydata <- data.frame(c(a,b,c,d), group)
names(mydata) <- c("value", "group")
# function returning min, mean - SD, mean, mean + SD and max of x,
# named after the aesthetics that geom = "boxplot" expects
min.mean.sd.max <- function(x) {
  r <- c(min(x), mean(x) - sd(x), mean(x), mean(x) + sd(x), max(x))
  names(r) <- c("ymin", "lower", "middle", "upper", "ymax")
  r
}
# ggplot code (the box spans mean +/- SD, not a 95% CI)
p1 <- ggplot(aes(y = value, x = factor(group)), data = mydata) +
  stat_summary(fun.data = min.mean.sd.max, geom = "boxplot") +
  ggtitle("Boxplot with mean, mean +/- SD, min. and max. values") +
  xlab("Groups") + ylab("Values")
p1
In my case I do not have the actual data points but rather only their mean and standard deviation (the data are normally distributed). So for this example it will be:
mydata.mine <- data.frame(
  mean  = c(mean(a), mean(b), mean(c), mean(d)),
  sd    = c(sd(a), sd(b), sd(c), sd(d)),
  group = c(1, 2, 3, 4)
)
However I would still like to produce a boxplot. I thought of defining:
ymin = mean - 3*sd
lower = mean - sd
middle = mean
upper = mean + sd
ymax = mean + 3*sd
but I don't know how to write a function for fun.data in stat_summary that can access the mean and sd columns of mydata.mine. Alternatively, I could just use rnorm to draw points from a normal distribution parameterized by the mean and sd I have, but the first option seems more elegant and simpler to me.
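For comparison, the rnorm alternative mentioned above might look roughly like the sketch below. The means and standard deviations in `stats` are hypothetical placeholders for mydata.mine, and `n_sim` (the number of simulated points per group) is an assumption, not something given in the question.

```r
library(ggplot2)
set.seed(1)

# Hypothetical summary statistics standing in for mydata.mine
stats <- data.frame(
  group = factor(1:4),
  mean  = c(0.1, 0.3, -0.2, 0.0),
  sd    = c(0.8, 1.0, 0.9, 1.1)
)
n_sim <- 1000  # assumed number of simulated points per group

# Draw n_sim values per group from a normal with that group's mean and sd
sim <- do.call(rbind, lapply(seq_len(nrow(stats)), function(i) {
  data.frame(
    group = stats$group[i],
    value = rnorm(n_sim, mean = stats$mean[i], sd = stats$sd[i])
  )
}))

# Ordinary boxplot of the simulated values
p_sim <- ggplot(sim, aes(x = group, y = value)) +
  geom_boxplot()
print(p_sim)
```

The resulting boxes depend on the random draw and on `n_sim`, which is one reason passing the statistics directly may be preferable.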
In ggplot2, geom_boxplot() draws a boxplot. For a regular boxplot, load the required libraries and the dataset, map the variables to plot with aes() inside ggplot(), and add geom_boxplot().
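As a minimal sketch of that regular case, the following rebuilds the raw example data from the question (same seed and group sizes) and draws a default boxplot. The names `values` and `p_raw` are introduced here for illustration only.

```r
library(ggplot2)
set.seed(1)

# Rebuild the example data from the question: four groups of unequal size
values <- c(rnorm(10), rnorm(12), rnorm(7), rnorm(15))
group  <- factor(rep(1:4, c(10, 12, 7, 15)))
mydata <- data.frame(value = values, group = group)

# Default boxplot: statistics are computed from the raw values by stat_boxplot
p_raw <- ggplot(mydata, aes(x = group, y = value)) +
  geom_boxplot()
print(p_raw)
```

This route needs the individual observations, which is exactly what is missing when only means and standard deviations are available.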
A box-and-whisker plot (also called a box plot) displays the five-number summary of a data set: the minimum, first quartile, median, third quartile, and maximum.
In ggplot2, the boxplot compactly displays the distribution of a continuous variable. It visualises five summary statistics (the median, two hinges and two whiskers) and draws all "outlying" points individually. When only precomputed statistics are available, they can be passed directly as the ymin, lower, middle, upper and ymax aesthetics with stat = "identity":
ggplot(mydata.mine, aes(x = as.factor(group))) +
  geom_boxplot(
    aes(
      lower  = mean - sd,
      upper  = mean + sd,
      middle = mean,
      ymin   = mean - 3 * sd,
      ymax   = mean + 3 * sd
    ),
    stat = "identity"
  )
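Since the data are stated to be normally distributed, one possible variation is to place the box at the theoretical quartiles rather than at mean ± sd, so the plot resembles a conventional boxplot. The sketch below assumes hypothetical values in `stats` in place of mydata.mine, and it puts the whiskers at the 1.5 × IQR fences, which is only an approximation because real whiskers stop at the most extreme observation inside the fence.

```r
library(ggplot2)

# Hypothetical summary statistics standing in for mydata.mine
stats <- data.frame(
  group = factor(1:4),
  mean  = c(0.1, 0.3, -0.2, 0.0),
  sd    = c(0.8, 1.0, 0.9, 1.1)
)

# Multipliers for a normal distribution, in units of sd
z_q     <- qnorm(0.75)              # quartile offset, about 0.674
z_fence <- z_q + 1.5 * (2 * z_q)    # quartile + 1.5 * IQR, about 2.698

# Precompute the five boxplot statistics for each group
stats$lower <- stats$mean - z_q * stats$sd
stats$upper <- stats$mean + z_q * stats$sd
stats$ymin  <- stats$mean - z_fence * stats$sd
stats$ymax  <- stats$mean + z_fence * stats$sd

# Draw the boxes from the precomputed columns
p_q <- ggplot(stats, aes(x = group)) +
  geom_boxplot(
    aes(ymin = ymin, lower = lower, middle = mean,
        upper = upper, ymax = ymax),
    stat = "identity"
  )
print(p_q)
```

Whether to use mean ± sd or quartile-based hinges is a presentation choice. Either way the axis labels or caption should say which statistics the box and whiskers represent.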