Consistent box width with missing values in a ggplot box plot [duplicate]

Tags: plot, r, na, ggplot2

In ggplot2, I want my boxes in a box plot to be of equal width even when a given level combination does not exist.

For example, in mtcars, the combination cyl=8 and gear=4 does not exist, which leads to wider boxes for cyl=8 in this plot:

qplot(data=mtcars, x=as.factor(cyl), y=mpg,
      colour=as.factor(gear), geom="boxplot")

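To confirm which level combinations are empty, a quick cross-tabulation helps. This is a minimal sketch in base R using the built-in mtcars data; the names counts and empty are made up for illustration:

```r
# Count the observations in each cyl x gear combination of the built-in mtcars data.
counts <- with(mtcars, table(cyl, gear))
print(counts)

# Combinations with no observations; these get no box of their own in the plot.
empty <- subset(as.data.frame(counts), Freq == 0)
print(empty)
```

Any row listed in empty is a combination whose space is taken over by the neighbouring boxes.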

For a bar plot, padding the data frame with NA values for the missing level combinations would solve the problem, but this does not work for a box plot:

library(plyr)  # provides rbind.fill()

mtcars.fill <- data.frame(cyl=8, gear=4, mpg=NA)
mtcars <- rbind.fill(mtcars, mtcars.fill)

qplot(data=mtcars, x=as.factor(cyl), y=mpg, colour=as.factor(gear), geom="boxplot")

Warning message:
Removed 1 rows containing non-finite values (stat_boxplot). 

This produces exactly the same plot.

stat_boxplot has an na.rm argument, but it does not help here: with the default, FALSE, missing values are removed with a warning, and with TRUE they are removed silently:

na.rm = FALSE
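To check that na.rm affects only the warning and not the layout, one can count the boxes ggplot2 actually builds. This is a minimal sketch, assuming a ggplot2 version that provides layer_data(); the names padded, base, p_warn and p_quiet are made up for illustration, and base R's rbind() stands in for rbind.fill():

```r
library(ggplot2)

# Pad a copy of mtcars with the missing cyl = 8 / gear = 4 combination.
padded <- rbind(mtcars[, c("cyl", "gear", "mpg")],
                data.frame(cyl = 8, gear = 4, mpg = NA))

base <- ggplot(padded, aes(x = factor(cyl), y = mpg, colour = factor(gear)))
p_warn  <- base + geom_boxplot()              # default na.rm = FALSE: NA row dropped with a warning
p_quiet <- base + geom_boxplot(na.rm = TRUE)  # NA row dropped silently

# Either way the NA row is gone before the box statistics are computed,
# so both plots should contain the same number of boxes.
n_warn  <- nrow(suppressWarnings(layer_data(p_warn)))
n_quiet <- nrow(layer_data(p_quiet))
print(c(n_warn, n_quiet))
```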
Etienne Low-Décarie asked Nov 12 '22 19:11


1 Answer

The best I can offer is a work-around using facet_grid(). This has the added benefit that points from a geom_point() layer will line up with the boxplots.

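library(ggplot2)
library(gridExtra)  # arrangeGrob()

# One facet per cyl value: every panel reserves a slot for each gear level,
# so all boxes get the same width even where a combination is missing.
plot1 = ggplot(mtcars, aes(x=factor(gear), y=mpg, colour=factor(gear))) +
        geom_boxplot() +
        facet_grid(. ~ cyl, labeller=label_both)

# Points share the discrete x axis, so they line up with the boxes.
plot2 = plot1 + geom_point()

# Save both versions side by side.
ggsave(filename="plots.png", plot=arrangeGrob(plot1, plot2, ncol=2),
       width=10, height=4, dpi=120)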

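A different technique from the faceting work-around is to keep the dodged layout and ask the position adjustment to preserve the width of a single box. This is a sketch, assuming a ggplot2 release in which position_dodge2() accepts a preserve argument (3.0.0 or later, to my knowledge); the name p is made up for illustration:

```r
library(ggplot2)

# Assumes a ggplot2 version where position_dodge2() has a `preserve` argument.
# preserve = "single" sizes every box as if all dodge groups were present,
# instead of stretching the remaining boxes to fill the slot of a missing group.
p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg, colour = factor(gear))) +
  geom_boxplot(position = position_dodge2(preserve = "single"))

print(p)
```

This keeps cyl on the x axis, as in the question, rather than moving it into facets. Whether it or the facet layout reads better depends on the plot.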

bdemarest answered Nov 15 '22 06:11