
Boxplot width in ggplot with cross classified groups

Tags: r, ggplot2

I am making boxplots with ggplot with data that is classified by 2 factor variables. I'd like to have the box sizes reflect sample size via varwidth = TRUE but when I do this the boxes overlap.

1) Some sample data with a 3 x 2 structure

library(ggplot2)
data <- data.frame(group1 = sample(c("A", "B", "C"), 100, replace = TRUE),
                   group2 = sample(c("D", "E"), 100, replace = TRUE),
                   response = rnorm(100, mean = 0, sd = 1))
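For context, varwidth = TRUE draws each box with a width proportional to the square root of the number of observations in its group, so the cell counts are what the widths encode. A quick way to look at them is sketched below; the seed and the names `counts` and `rel_width` are added here for illustration only and are not part of the original data setup.

```r
set.seed(1234)  # assumption: seed added only so the counts are reproducible
data <- data.frame(group1 = sample(c("A", "B", "C"), 100, replace = TRUE),
                   group2 = sample(c("D", "E"), 100, replace = TRUE),
                   response = rnorm(100, mean = 0, sd = 1))

# Observations per group1 x group2 cell
counts <- table(data$group1, data$group2)
print(counts)

# Rough illustration of the relative widths: sqrt(n), scaled so the largest is 1
rel_width <- sqrt(counts) / max(sqrt(counts))
print(round(rel_width, 2))
```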

2) Default boxplots: ggplot without variable width

ggplot(data = data, aes(y = response, x = group1, color = group2)) + geom_boxplot()


I like how the first level of grouping is shown.
Now I try to add variable widths...

3) ...and what I get when varwidth = TRUE

ggplot(data = data, aes(y = response, x = group1, color = group2)) + geom_boxplot(varwidth = TRUE)


This overlap seems to occur whether I use color = group2 or group = group2, and whether I set it in the main call to ggplot or in the geom_boxplot statement. Fussing with position_dodge doesn't seem to help either.

4) A solution I don't like visually is to make unique factors by combining my group1 and group2

data$grp.comb <- paste(data$group1, data$group2)

ggplot(data = data, aes(y = response, x = grp.comb, color = group2)) + geom_boxplot()


I prefer having things grouped to reflect the cross classification.

5) The way forward: I'd like to either a) figure out how to make varwidth = TRUE not cause the boxes to overlap, or b) manually adjust the space between the combined groups so that boxes within the first level of grouping are closer together.
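One possible sketch of option (b) is to place every box at a hand-computed numeric x position and relabel the axis afterwards. The column `xpos`, the `offset` of 0.2, the `width` of 0.35, and the seed are all illustrative assumptions, not values from the question; `position = "identity"` is used so that nothing is dodged automatically.

```r
library(ggplot2)

set.seed(1234)  # assumption: seed added for reproducibility
data <- data.frame(group1 = sample(c("A", "B", "C"), 100, replace = TRUE),
                   group2 = sample(c("D", "E"), 100, replace = TRUE),
                   response = rnorm(100, mean = 0, sd = 1))
data$group1 <- factor(data$group1)
data$group2 <- factor(data$group2)

# Hypothetical helper column: group1 index, shifted left for "D" and right for "E"
offset <- 0.2
data$xpos <- as.numeric(data$group1) + ifelse(data$group2 == "D", -offset, offset)

p <- ggplot(data, aes(x = xpos, y = response, color = group2,
                      group = interaction(group1, group2))) +
  # width caps the widest box; varwidth shrinks the others relative to it
  geom_boxplot(varwidth = TRUE, width = 0.35, position = "identity") +
  # put the group1 labels back on the numeric axis
  scale_x_continuous(breaks = seq_along(levels(data$group1)),
                     labels = levels(data$group1),
                     name = "group1")
print(p)
```

Because the widest box is 0.35 units wide and the two centers within a group1 level are 0.4 units apart, the boxes cannot overlap; shrinking `offset` pulls the pairs closer together, as long as it stays above half of `width`.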

N Brouwer Avatar asked Aug 13 '14 20:08



2 Answers

I think your problem can be solved best by using facet_wrap.

    library(ggplot2)
    data <- data.frame(group1 = sample(c("A", "B", "C"), 100, replace = TRUE),
                       group2 = sample(c("D", "E"), 100, replace = TRUE),
                       response = rnorm(100, mean = 0, sd = 1))

    ggplot(data = data, aes(y = response, x = group2, color = group2)) + 
      geom_boxplot(varwidth = TRUE) +
      facet_wrap(~group1)

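If the facet strips make the plot look less like a single grouped axis than you want, a variation along these lines may help. It is only a sketch: it moves the strips below the panels and removes the gap between panels, using `strip.position`, `strip.placement`, and `panel.spacing`. The seed and the specific theme choices are assumptions added for illustration.

```r
library(ggplot2)

set.seed(1234)  # assumption: seed added for reproducibility
data <- data.frame(group1 = sample(c("A", "B", "C"), 100, replace = TRUE),
                   group2 = sample(c("D", "E"), 100, replace = TRUE),
                   response = rnorm(100, mean = 0, sd = 1))

p <- ggplot(data, aes(y = response, x = group2, color = group2)) +
  geom_boxplot(varwidth = TRUE) +
  # one row of panels, with the group1 label underneath each panel
  facet_wrap(~group1, nrow = 1, strip.position = "bottom") +
  theme(panel.spacing = grid::unit(0, "lines"),  # no gap between panels
        strip.placement = "outside",             # strip sits below the axis labels
        strip.background = element_blank())
print(p)
```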

RHA Avatar answered Nov 20 '22 05:11



A recent update to ggplot2 makes it so that the code provided by @N Brouwer in (3) works as expected:

# library(devtools)
# install_github("tidyverse/ggplot2")

packageVersion("ggplot2") # works with v2.2.1.9000
library(ggplot2)
set.seed(1234)
data <- data.frame(group1= sample(c("A","B","C"), 100, replace = TRUE),
                   group2= sample(c("D","E"), 100, replace = TRUE),
                   response = rnorm(100, mean = 0, sd = 1))

ggplot(data = data, aes(y = response, x = group1, color = group2)) + 
  geom_boxplot(varwidth = TRUE)
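This change appears to have reached CRAN with ggplot2 3.0.0, where geom_boxplot() started using position = "dodge2" by default. The sketch below checks the installed version and names the position explicitly; the 3.0.0 cutoff is an assumption based on the release notes as I recall them, so adjust it if your setup behaves differently.

```r
library(ggplot2)

# Assumption: dodged varwidth boxplots need ggplot2 3.0.0 or later
stopifnot(packageVersion("ggplot2") >= "3.0.0")

set.seed(1234)
data <- data.frame(group1 = sample(c("A", "B", "C"), 100, replace = TRUE),
                   group2 = sample(c("D", "E"), 100, replace = TRUE),
                   response = rnorm(100, mean = 0, sd = 1))

# Same plot as above, with the dodging position spelled out
p <- ggplot(data = data, aes(y = response, x = group1, color = group2)) +
  geom_boxplot(varwidth = TRUE, position = "dodge2")
print(p)
```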


Jai Broome Avatar answered Nov 20 '22 05:11
