
How can I make geom_area() leave a gap for missing values?

Tags: r, ggplot2

When I plot using geom_area() I expect it to perform a lot like geom_bar(), but I'm a little perplexed by this behavior for missing values.

    require(dplyr)
    require(ggplot2)

    set.seed(1)

    test <- data.frame(x=rep(1:10,3), y=abs(rnorm(30)), z=rep(LETTERS[1:3],10)) %>% arrange(x,z) 

    # I also have no idea why geom_area() needs the data.frame to be sorted first.

    test[test$x==4,"y"] <- NA

    ggplot(test, aes(x, y, fill=z)) + geom_bar(stat="identity", position="stack") 

This produces a stacked bar chart:

[Image: graph using geom_bar()]

However, if I change to geom_area(), it interpolates across the missing values.

    > ggplot(test, aes(x, y, fill=z)) + geom_area(stat="identity", position="stack")
    Warning message:
    Removed 3 rows containing missing values (position_stack).

[Image: graph using geom_area()]

If I add in na.rm=FALSE or na.rm=TRUE it makes no difference.

    > ggplot(test, aes(x, y, fill=z)) + geom_area(stat="identity", position="stack", na.rm=TRUE)
    Warning message:
    Removed 3 rows containing missing values (position_stack).

[Image: graph with na.rm=TRUE]

    > ggplot(test, aes(x, y, fill=z)) + geom_area(stat="identity", position="stack", na.rm=FALSE)
    Warning message:
    Removed 3 rows containing missing values (position_stack).

[Image: graph with na.rm=FALSE]

Obviously, whatever I'm trying isn't working. How can I show a gap in the series with geom_area()?

asked May 08 '15 05:05 by Tom


1 Answer

It seems that the problem has to do with how the values are stacked. The warning message tells you that the rows containing missing values were removed, so there is simply no gap present in the data that you are plotting.
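To see why no gap survives, it can help to look at what is left once the NA rows are dropped. The sketch below uses base R only and is a rough illustration of the effect described in the warning, not ggplot2's internal code; kept is a name made up for this example.

```r
# Rebuild the data from the question (base R instead of dplyr for sorting).
set.seed(1)
test <- data.frame(x = rep(1:10, 3), y = abs(rnorm(30)), z = rep(LETTERS[1:3], 10))
test <- test[order(test$x, test$z), ]
test[test$x == 4, "y"] <- NA

# Assumption: dropping the NA rows approximates what the warning reports.
kept <- test[!is.na(test$y), ]

nrow(test) - nrow(kept)   # 3 rows removed, matching the warning
sort(unique(kept$x))      # 1 2 3 5 6 7 8 9 10 (x = 4 is gone entirely)
```

With x = 4 absent, each series is simply drawn from x = 3 to x = 5, which matches the interpolation seen in the question.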

However, geom_ribbon, of which geom_area is a special case, leaves gaps for missing values. geom_ribbon plots an area as well, but you have to specify the maximum and minimum y-values. So the trick can be done by calculating these values manually and then plotting with geom_ribbon(). Starting with your data frame test, I create the ymin and ymax data as follows:

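    # Stack the series by hand: each level of z starts where the previous one ends.
    # This relies on the rows being in the same x order within every level of z.
    test$ymax <- test$y
    test$ymin <- 0
    zl <- levels(factor(test$z))  # factor() so this also works if z is a character column
    for (i in seq_along(zl)[-1]) {
      zi   <- test$z == zl[i]
      zi_1 <- test$z == zl[i - 1]
      test$ymin[zi] <- test$ymax[zi_1]
      test$ymax[zi] <- test$ymin[zi] + test$ymax[zi]
    }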

and then plot with geom_ribbon:

    ggplot(test, aes(x=x, ymax=ymax, ymin=ymin, fill=z)) + geom_ribbon()

This gives the following plot:

[Image: stacked area plot drawn with geom_ribbon()]
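The same ymin and ymax columns can also be computed without an explicit loop. The following is a sketch, assuming the rows are sorted so that A, B, C appear in that order within each x; it uses base R's ave() to take a running sum per value of x. stacked is a name introduced here for illustration.

```r
# Rebuild the data from the question (base R instead of dplyr for sorting).
set.seed(1)
test <- data.frame(x = rep(1:10, 3), y = abs(rnorm(30)), z = rep(LETTERS[1:3], 10))
test <- test[order(test$x, test$z), ]   # within each x, rows run A, B, C
test[test$x == 4, "y"] <- NA

stacked <- test
# Upper edge of each band: running total of y within each x.
stacked$ymax <- ave(stacked$y, stacked$x, FUN = cumsum)
# Lower edge of each band: the running total just before this row (0 for the first).
stacked$ymin <- ave(stacked$y, stacked$x,
                    FUN = function(v) c(0, head(cumsum(v), -1)))
```

stacked can then be passed to the same geom_ribbon() call in place of test. The NA values at x = 4 carry through to ymax, so the gap is preserved.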

answered Sep 28 '22 11:09 by Stibu