
Plotting several groups of box plots side-by-side in R

Tags: plot, r, boxplot

I am trying to plot two box-plots in the same plot, each within the same category. I can generate the boxplots individually, but am stumped when I try to get them onto the same graph.

Here is what I have so far:

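# Two 100 x 3 matrices of uniform random values with different upper bounds
a <- matrix(runif(300, max = 2), nrow = 100, ncol = 3)
b <- matrix(runif(300, max = 1), nrow = 100, ncol = 3)
colnames(a) <- c("case 1", "case 2", "case 3")
colnames(b) <- c("case 1", "case 2", "case 3")
# Draws 6 boxes: the three columns of a, then the three columns of b
boxplot(cbind(a, b))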

This plot results in 6 boxplots, first 3 for a, then 3 for b.

Is there a trick or simple option I am missing that will give me the first column of a next to the first column of b, then the second pair, and finally the third pair, plotted so that there are only three ticks on the x-axis, one for each pair?

Any pointers greatly appreciated,

Iain

Iain asked Oct 12 '11 at 10:10



1 Answer

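# Boxes for a at x = 1, 4, 7; ylim must cover both a and b
boxplot(a, at = 0:2*3 + 1, xlim = c(0, 9), ylim = range(a, b), xaxt = "n")
# Boxes for b at x = 2, 5, 8, added to the existing plot
boxplot(b, at = 0:2*3 + 2, xaxt = "n", add = TRUE)
# One label per pair, centred between its two boxes
axis(1, at = 0:2*3 + 1.5, labels = colnames(a), tick = TRUE)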

Note the ylim = range(a, b) argument. The y-axis scale is fixed by the first boxplot() call, so if b contained values outside the range of a (not the case here, but try swapping a and b), those values would fall outside the plot region. That is why you should generally set ylim explicitly here.
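
To see the clipping described above, here is a small sketch that draws b first without ylim and then reads the y-axis limits back from par("usr"). The fixed seed and the names usr_default and usr_fixed are assumptions made for illustration; they are not part of the original answer.

```r
set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible
a <- matrix(runif(300, max = 2), nrow = 100, ncol = 3)
b <- matrix(runif(300, max = 1), nrow = 100, ncol = 3)

# Swapped order, no ylim: the y-axis is scaled to b alone
boxplot(b, at = 0:2*3 + 1, xlim = c(0, 9), xaxt = "n")
usr_default <- par("usr")[3:4]   # y-limits actually used by this plot
boxplot(a, at = 0:2*3 + 2, xaxt = "n", add = TRUE)  # upper part of a is cut off

# Same order, but ylim covers both data sets
boxplot(b, at = 0:2*3 + 1, xlim = c(0, 9), ylim = range(a, b), xaxt = "n")
usr_fixed <- par("usr")[3:4]
boxplot(a, at = 0:2*3 + 2, xaxt = "n", add = TRUE)

max(a) > usr_default[2]  # TRUE: some of a lies above the first plot region
max(a) <= usr_fixed[2]   # TRUE: everything fits in the second plot region
```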

You can also set tick = FALSE in the axis() call, which I think looks nicer. If you don't like the gap between the groups, use 0:2*2 instead of 0:2*3 and adjust xlim accordingly (for example, to c(0, 7)).
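
The spacing variant can be sketched by computing the positions from a single step value instead of hard-coding 0:2*3. The names n_groups, step, pos_a and pos_b are hypothetical helpers introduced here, and the seed is only there to make the sketch reproducible.

```r
set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible
a <- matrix(runif(300, max = 2), nrow = 100, ncol = 3)
b <- matrix(runif(300, max = 1), nrow = 100, ncol = 3)
colnames(a) <- c("case 1", "case 2", "case 3")
colnames(b) <- c("case 1", "case 2", "case 3")

n_groups <- ncol(a)
step <- 2                                     # 2 = no gap, 3 = one empty slot between groups
pos_a <- (seq_len(n_groups) - 1) * step + 1   # x positions for the boxes of a
pos_b <- pos_a + 1                            # boxes of b sit directly to the right

boxplot(a, at = pos_a, xlim = c(0, max(pos_b) + 1),
        ylim = range(a, b), xaxt = "n")
boxplot(b, at = pos_b, xaxt = "n", add = TRUE)
# One label per pair, centred between its two boxes, without tick marks
axis(1, at = pos_a + 0.5, labels = colnames(a), tick = FALSE)
```

With step <- 3 the same code gives the positions 1, 4, 7 and 2, 5, 8 and the xlim of c(0, 9) used in the answer above.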

Tomas answered Oct 17 '22 at 01:10