ggplot2 make missing value in geom_tile not blank

Tags:

r

ggplot2

In the geom_tile() layer of the ggplot2 data visualization package for R, a cell that contains no data is not drawn. For an example, see http://docs.ggplot2.org/current/geom_tile.html and search for "missing value".

I would like to change this behavior to show the minimum value over all the tiles. Is this possible and if so how?

Additional context: when I use

stat_density2d(aes(x=x,y=y, fill=..density..), geom="tile", contour=FALSE)

I would like the regions with no density to look very similar to the regions with very little density. As it is now, if the color spectrum runs from blue to red and the background is white, a tile with no data is white while a tile with a single data point is blue.

Adding a pseudo-count to the data seems possible, but how do I know in advance how to distribute the pseudo-counts? And how would that work when the plot is faceted?

momeara asked Aug 02 '11 02:08


4 Answers

If your data lies on a regular grid, how about adding a second geom_tile() for the NA cells, selected with subset()?

library(ggplot2)

# Generate data: a radial wave sampled on an n x n grid
pp <- function (n, r = 4) {
  x    <- seq(-r*pi, r*pi, len = n)
  df   <- expand.grid(x = x, y = x)
  df$r <- sqrt(df$x^2 + df$y^2)
  df$z <- cos(df$r^2)*exp(-df$r/6)
  df
}
# Keep a random half of the 400 cells so that the rest are missing
pp20 <- pp(20)[sample(20*20, size = 200),]

# Rebuild the full grid and merge, so missing cells become rows with z = NA
df_grid  <- expand.grid(x = unique(pp20$x), y = unique(pp20$y))
df_merge <- merge(pp20, df_grid, by = c("x", "y"), all = TRUE)

# Draw cells with data using the fill scale, and missing cells in a fixed colour
ggplot(df_merge, aes(x = x, y = y)) +
  geom_tile(data = subset(df_merge, !is.na(z)), aes(fill = z)) +
  geom_tile(data = subset(df_merge,  is.na(z)), aes(colour = NA),
    linetype = 0, fill = "pink", alpha = 0.5)

Triad sou. answered Nov 09 '22 19:11


This issue can also be fixed with the na.value option of scale_fill_continuous:

scale_fill_continuous(na.value = 'salmon')

Edit below:

This only fills in explicitly missing values, i.e. cells that are present in the data with a value of NA. Cells that are absent from the data altogether are still not drawn. (It may have worked differently in previous versions of ggplot; I'm too lazy to check.)

See the following code for an example:

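library(tidyverse)

Data <- expand.grid(x = 1:5, y = 1:5) %>%
  mutate(Value = rnorm(25))

# Implicitly missing: the y == 3 rows are absent, so na.value has no effect
Data %>%
  filter(y != 3) %>%
  ggplot(aes(x = x, y = y, fill = Value)) +
  geom_tile() +
  scale_fill_continuous(na.value = 'salmon')

# Explicitly missing: 22 randomly chosen values are set to NA and drawn in salmon
Data %>%
  mutate(Value = ifelse(1:n() %in% sample(1:n(), 22), NA, Value)) %>%
  ggplot(aes(x = x, y = y, fill = Value)) +
  geom_tile() +
  scale_fill_continuous(na.value = 'salmon')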

An easy fix for this is to use the complete function to make the missing values explicit.

Data %>%
  filter(1:n() %in% sample(1:n(),22)) %>%
  complete(x,y) %>%
  ggplot(aes(x=x,y=y,fill=Value))+
  geom_tile()+
  scale_fill_continuous(na.value = 'salmon')

In some cases the expand function may be more useful than the complete function.
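As a rough sketch of what the expand route might look like (not necessarily what the answer had in mind): tidyr::expand() returns only the x/y combinations, so the data has to be joined back on afterwards. The names full, sparse and filled, the seed, and the use of dplyr::slice_sample() are assumptions made for this illustration.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

set.seed(1)  # arbitrary seed, only so the dropped cells are reproducible

# A 5 x 5 grid with 3 rows removed: those cells are implicitly missing
full   <- expand.grid(x = 1:5, y = 1:5) %>% mutate(Value = rnorm(25))
sparse <- full %>% slice_sample(n = 22)

# expand() yields every x/y combination seen in the data; the left join
# then leaves Value = NA for the cells that were dropped
filled <- sparse %>%
  tidyr::expand(x, y) %>%
  left_join(sparse, by = c("x", "y"))

p <- ggplot(filled, aes(x = x, y = y, fill = Value)) +
  geom_tile() +
  scale_fill_continuous(na.value = "salmon")
# print(p) draws the plot
```

Compared with complete(), this keeps the grid construction as a separate step, which may help when the grid needs to be built from something other than the values present in the data.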

Bishops_Guest answered Nov 09 '22 18:11


For posterity, here is the right solution, compatible with ggplot2 version 0.9.3:

+ theme(panel.background=element_rect(fill="blue", colour="blue"))
  • In joran's answer, the plot.background is the whole plot including the title and legend etc. The panel.background is the area where the data appears.

  • In the latest version of ggplot2, opts has been replaced with theme and theme_rect has been replaced with element_rect.

  • In specifying element_rect, color is the boundary of the rectangle while fill is the interior of the rectangle.

I had originally used,

+ geom_rect(aes(xmin=-Inf, xmax=Inf, ymin=-Inf, ymax=Inf), fill="blue")

but when I added geom_raster rather than geom_tile over the background and generated PDF output, PDF viewers had a very hard time rendering the plot, using substantially more CPU cycles and memory.
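Putting the panel.background approach together with the density plot from the question, a minimal sketch might look like the following. It assumes a recent ggplot2 (stat_density_2d() and after_stat() in place of the older stat_density2d() and ..density.. spellings); the data frame pts, the seed and the variable low_col are made up for the illustration.

```r
library(ggplot2)

set.seed(42)  # arbitrary seed for reproducible sample points
pts <- data.frame(x = rnorm(500), y = rnorm(500))

# Assumed low end of the colour scale; reused for the panel background
low_col <- "blue"

p <- ggplot(pts, aes(x = x, y = y)) +
  stat_density_2d(aes(fill = after_stat(density)),
                  geom = "tile", contour = FALSE) +
  scale_fill_gradient(low = low_col, high = "red") +
  # Anything not covered by a tile now shows the low colour
  theme(panel.background = element_rect(fill = low_col, colour = low_col),
        panel.grid = element_blank())
# print(p) draws the plot
```

Keeping the low colour in one variable means the scale and the background cannot drift apart if the palette is changed later.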

momeara answered Nov 09 '22 17:11


This answer may perhaps be a bit too 'cute', but could one solution be to simply change the background color of your plot to be the minimum color in your scale? For instance:

+ opts(plot.background = theme_rect(colour = "blue"))

If your plot has a more complex structure and this ends up making the background blue in areas where you don't want that to happen, you could plot a geom_rect layer first that extends over the ranges of your data only.
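One possible way to sketch that rectangle-first idea in current ggplot2 is shown below. It uses annotate("rect", ...) rather than geom_rect() so that a single rectangle is drawn, and it assumes tiles on a grid with unit spacing (hence the 0.5 padding); the data frame cells and the seed are invented for the example.

```r
library(ggplot2)

set.seed(7)  # arbitrary seed for reproducible gaps
cells   <- expand.grid(x = 1:10, y = 1:10)
cells$z <- runif(100)
cells   <- cells[sample(100, 60), ]  # drop 40 cells to leave gaps

p <- ggplot(cells, aes(x = x, y = y)) +
  # Background rectangle covering only the data range, drawn first;
  # 0.5 is half a tile width under the unit-spacing assumption
  annotate("rect",
           xmin = min(cells$x) - 0.5, xmax = max(cells$x) + 0.5,
           ymin = min(cells$y) - 0.5, ymax = max(cells$y) + 0.5,
           fill = "blue") +
  geom_tile(aes(fill = z)) +
  scale_fill_gradient(low = "blue", high = "red")
# print(p) draws the plot
```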

joran answered Nov 09 '22 17:11