 

Position-dodge warning with ggplot boxplot?

Tags: r, ggplot2

I'm trying to make a boxplot with ggplot2 using the following code:

p <- ggplot(
      data,
      aes(d$score, reorder(d$names d$scores, median))
     ) +
       geom_boxplot()

I have factors called names and integers called scores.

My code produces a plot, but the graphic does not depict the boxes (only shows lines) and I get a warning message, "position_dodge requires non-overlapping x intervals." I've tried to adjust the height and width with geom_boxplot(width=5), but this does not seem to fix the problem. Can anyone suggest a possible solution to my problem?

I should point out that my boxplot is rather large (it has about 200 name values on the y-axis). Perhaps this is the problem?

drbunsen asked Dec 15 '11 15:12


3 Answers

The number of groups is not the problem; I can see the same thing even when there are only 2 groups. The issue is that ggplot2 draws boxplots vertically (continuous along y, categorical along x) and you are trying to draw them horizontally (continuous along x, categorical along y).

Also, your example has several syntax errors and isn't reproducible because we don't have data/d.

Start with some mock data:

dat <- data.frame(scores=rnorm(1000,sd=500), 
                  names=sample(LETTERS, 1000, replace=TRUE))

Corrected version of your example code:

ggplot(dat, aes(scores, reorder(names, scores, median))) + geom_boxplot()


This produces the horizontal lines you saw.

If you instead put the categorical variable on the x axis and the continuous variable on the y axis, you get proper boxes:

ggplot(dat, aes(reorder(names, scores, median), scores)) + geom_boxplot()


Finally, if you want to flip the coordinate axes, you can use coord_flip(). There can be some additional problems with this if you are doing even more sophisticated things, but for basic boxplots it works.

ggplot(dat, aes(reorder(names, scores, median), scores)) + 
  geom_boxplot() + coord_flip()
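
For the scale described in the question (about 200 names), the same approach can be sketched as one self-contained script. The seed, the made-up name labels, and the sample size are assumptions chosen only for illustration; they are not the asker's data.

```r
library(ggplot2)

set.seed(42)  # assumption: fixed seed so the mock data is reproducible

# Mock data with roughly 200 distinct names (labels are made up)
dat <- data.frame(
  scores = rnorm(4000, sd = 500),
  names  = sample(sprintf("name%03d", 1:200), 4000, replace = TRUE)
)

# Categorical on x, continuous on y, then flip so the boxes run horizontally
p <- ggplot(dat, aes(reorder(names, scores, median), scores)) +
  geom_boxplot() +
  coord_flip() +
  labs(x = "names", y = "scores")

print(p)
```

With this many categories the axis labels get crowded, so a tall plotting device is likely to be needed to keep them legible.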


Brian Diggs answered Nov 20 '22 00:11


In case anyone else arrives here wondering why they're seeing

Warning message:

position_dodge requires non-overlapping x intervals

Why this happens

This happens because some of the boxplots or violins (or elements of another plot type) may be overlapping. In many cases you may not care, but in some cases it matters, which is why ggplot2 warns you.

How to fix it

You have two options. The first is to suppress warnings when generating or printing the ggplot.

The second is to alter the width of the plotted elements so that they don't overlap, at which point the warning goes away. Try altering the width argument to the geom, e.g. geom_boxplot(width = 0.5) (the same works for geom_violin()).
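
Both options can be sketched as follows. The mock data frame and its column names are assumptions for illustration; suppressWarnings() is base R, and it hides every warning raised while the plot is drawn, not only this one.

```r
library(ggplot2)

set.seed(1)  # assumption: mock data, not taken from the question
dat <- data.frame(
  scores = rnorm(200),
  names  = sample(LETTERS[1:5], 200, replace = TRUE)
)

# Option 2: set an explicit, narrower width on the geom
p <- ggplot(dat, aes(names, scores)) +
  geom_boxplot(width = 0.5)

# Option 1: wrap the print call so warnings raised during drawing are hidden
suppressWarnings(print(p))
```

Suppressing is the blunter of the two, so it is probably worth checking the plot once without it to confirm the overlap is harmless.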

stevec answered Nov 20 '22 00:11


In addition to @stevec's options, if you're seeing

  • position_stack requires non-overlapping x intervals
  • position_fill requires non-overlapping x intervals
  • position_dodge requires non-overlapping x intervals
  • position_dodge2 requires non-overlapping x intervals

and if your x variable is supposed to overlap for different aesthetics such as fill, you can try making the x_var into a factor:

geom_bar(aes(x = factor(x_var), fill = type))
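
A minimal sketch of that conversion, assuming a hypothetical data frame with a numeric x_var, a grouping column type, and a precomputed count n (none of these come from the question). It uses geom_col() rather than geom_bar() only because the heights are given explicitly here.

```r
library(ggplot2)

# Hypothetical data: numeric x values that are really categories
df <- data.frame(
  x_var = c(1, 1, 2, 2, 3.5, 3.5),
  type  = rep(c("a", "b"), times = 3),
  n     = c(3, 5, 2, 4, 6, 1)
)

# factor(x_var) makes the x scale discrete, so each distinct value gets its own
# slot and the bars for different fill groups are dodged within that slot
p <- ggplot(df) +
  geom_col(aes(x = factor(x_var), y = n, fill = type), position = "dodge")

print(p)
```

Whether the unconverted numeric version actually warns depends on the spacing of the x values and the bar widths, so treat this as an illustration of the conversion rather than a reproduction of the warning.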

Arthur Yip answered Nov 20 '22 00:11