Is there a way to create a boxplot in R that will display with the box (somewhere) an "N=(sample size)"? The varwidth logical adjusts the width of the box on the basis of sample size, but that doesn't allow comparisons between different plots.
FWIW, I am using the boxplot command in the following fashion, where 'f1' is a factor:
boxplot(xvar ~ f1, data=frame, xlab="input values", horizontal=TRUE)
Here's some ggplot2 code. It's going to display the sample size at the sample mean, making the label multifunctional!
First, a simple function for fun.data:
# Return the label position (the group mean) and the label text (the group size)
give.n <- function(x){
  return(data.frame(y = mean(x), label = length(x)))
}
Now, to demonstrate with the diamonds data:
library(ggplot2)
ggplot(diamonds, aes(cut, price)) +
  geom_boxplot() +
  stat_summary(fun.data = give.n, geom = "text")
You may have to play with the text size to make it look good, but now you have a label for the sample size which also gives a sense of the skew.
You can use the names parameter to write the n next to each factor name.
If you don't want to calculate the n yourself, you could use this little trick:
# Compute the boxplot statistics without drawing the plot
b <- boxplot(xvar ~ f1, data=frame, plot=FALSE)
# Now b$n holds the counts for each factor level; write them into the group names
boxplot(xvar ~ f1, data=frame, xlab="input values", names=paste0(b$names, " (n=", b$n, ")"))
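For a version you can paste and run as-is, here is a minimal sketch of the same trick on the built-in mtcars data. Here mpg and cyl are just stand-ins for xvar and f1, and the labels variable is my own name, not something boxplot provides.

```r
# Compute the per-group statistics without drawing anything
b <- boxplot(mpg ~ cyl, data = mtcars, plot = FALSE)

# Build one label per group, e.g. "4 (n=11)"
labels <- paste0(b$names, " (n=", b$n, ")")

# Draw the boxplot with the counts in the group labels
boxplot(mpg ~ cyl, data = mtcars, xlab = "cyl", ylab = "mpg", names = labels)
```

Using paste0 rather than paste avoids the extra spaces that paste's default separator would put inside the parentheses.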
To get the n on top of the box, you could use text with the stats details provided by boxplot as follows:
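# Draw the boxplot and keep the returned statistics (text needs an existing plot)
b <- boxplot(xvar ~ f1, data=frame, xlab="input values")
# Place each label just above the upper whisker; the +1 offset is in data units
text(seq_along(b$n), b$stats[5,]+1, paste0("n=", b$n))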
The stats field of b is a matrix with one column per group/plot. Each column contains, in order, the extreme of the lower whisker, the lower hinge, the median, the upper hinge and the extreme of the upper whisker.
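Since the plot in the question uses horizontal=TRUE, note that the values then run along the x axis, so the coordinates passed to text swap places. A rough sketch on the built-in mtcars data, where mpg and cyl are stand-ins for xvar and f1, and pos=4 and xpd=NA are my own choices so the label sits to the right of the whisker and isn't clipped at the plot edge:

```r
# Draw a horizontal boxplot and keep the returned statistics
b <- boxplot(mpg ~ cyl, data = mtcars, horizontal = TRUE,
             xlab = "mpg", ylab = "cyl")

# Row 5 of b$stats is the extreme of the upper whisker, one value per group
upper <- b$stats[5, ]

# Values run along x and groups along y, so the first two arguments are swapped
text(upper, seq_along(b$n), paste0("n=", b$n), pos = 4, xpd = NA)
```

If the labels crowd the right margin, widening it with par(mar=...) before plotting should help.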