Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Boxplots for groups?

Tags:

r

boxplot

I have a dataset (test) as given below:

Type    Met1    Met2    Met3    Met4
TypeA   65  43  97  77
TypeA   46  25  76  77
TypeA   44  23  55  46
TypeA   46  44  55  77
TypeA   33  22  55  54
TypeB   66  8   66  47
TypeB   55  76  66  65
TypeB   55  77  88  46
TypeB   36  67  55  44
TypeB   67  55  76  65

I have checked a lot of links on box plots, but I still have not succeeded for the type of box plot I want. I wish to have a boxplot with my X-axis having type A (yellow, orange) for all the Mets (Met1, Met2, Met3, Met4). In essence, I want something like the following (taken from here):

enter image description here

I am trying somethings like,

boxplot(formula = len ~ Type , data = test, subset == "TypeA")
boxplot(formula = len ~ Type , data = test, subset == "TypeA", add=TRUE)
Legend(legend = c( "typeA", "typeB" ), fill = c( "yellow", "orange" ) )

But I am not able to work it out with any of it. Can anyone help me know how do I make such box plots on my test data in the corrected way?

like image 407
Letin Avatar asked Jan 30 '13 16:01

Letin


People also ask

How do I make a grouped Boxplot?

Box plot for multiple groups In order to create a box plot by group in R you can pass a formula of the form y ~ x , being x a numerical variable and y a categoriacal variable to the boxplot function. You can pass the variables accessing the data from the data frame using the dollar sign or subsetting the data frame.

Are Boxplots good for comparing multiple groups?

Boxplots are particularly useful for assessing quickly the location, dispersion, and symmetry or skewness of a set of data, and for making comparisons of these features in two or more data sets.

Can you use Boxplots for categorical data?

Use boxplots and individual value plots when you have a categorical grouping variable and a continuous outcome variable. The levels of the categorical variables form the groups in your data, and the researchers measure the continuous variable.


2 Answers

A solution with ggplot2.

First, transform your data frame test to the long format using melt:

library(reshape2)
test.m <- melt(test)

Plot the data:

library(ggplot2)
ggplot(test.m, aes(x = variable, y = value, fill = Type)) +
  geom_boxplot() +
  scale_fill_manual(values = c("yellow", "orange"))

enter image description here

like image 80
Sven Hohenstein Avatar answered Sep 22 '22 07:09

Sven Hohenstein


As others have said, first you need to melt your data.

df <- read.table(text="Type    Met1    Met2    Met3    Met4
TypeA   65  43  97  77
TypeA   46  25  76  77
TypeA   44  23  55  46
TypeA   46  44  55  77
TypeA   33  22  55  54
TypeB   66  8   66  47
TypeB   55  76  66  65
TypeB   55  77  88  46
TypeB   36  67  55  44
TypeB   67  55  76  65",header=TRUE)

library(reshape2)
df2 <- melt(df)

boxplot(
  formula = value ~ variable,
  data    = df2,
  boxwex  = 0.25,
  at      = 1:4 - 0.2,
  subset  = Type == "TypeA",
  col     = "yellow",
  main    = "blah",
  xlab    = "x",
  ylab    = "y",
  ylim    = c( 0, ceiling( max( df2$value ) ) + 1 ),
  yaxs    = "i" )


boxplot(
  formula = value ~ variable,
  data    = df2,
  boxwex  = 0.25,
  at      = 1:4 + 0.2,
  subset  = Type == "TypeB",
  col     = "orange",
  add     = TRUE )
like image 44
Roland Avatar answered Sep 20 '22 07:09

Roland