
geom_raster interpolation with log scale

I'm a bit stuck plotting a raster with a log scale. Consider this plot for example:

ggplot(faithfuld, aes(waiting, eruptions)) +
 geom_raster(aes(fill = density))


But how to use a log scale with this geom? None of the usual methods are very satisfying:

 ggplot(faithfuld, aes(waiting, log10(eruptions))) +
   geom_raster(aes(fill = density))


 ggplot(faithfuld, aes(waiting, (eruptions))) +
   geom_raster(aes(fill = density)) + 
   scale_y_log10()


and this doesn't work at all:

 ggplot(faithfuld, aes(waiting, (eruptions))) +
   geom_raster(aes(fill = density)) + 
   coord_trans(x="log10")

Error: geom_raster only works with Cartesian coordinates

Are there any options for using a log scale with a raster?

To be precise, I have three columns of data. The z value is the one I want to use to colour the raster, and it is not computed from the x and y values. So I need to supply all three columns to the ggplot function. For example:

dat <- data.frame(x = rep(1:10, 10), 
                  y = unlist(lapply(1:10, function(i) rep(i, 10))), 
                  z = faithfuld$density[1:100])

ggplot(dat, aes(x = log(x), y = y, fill = z)) +
  geom_raster()


What can I do to get rid of those gaps in the raster?

Note that this question is related to these two:

  • geom_raster interpolation with log scale
  • Use R to recreate contour plot made in Igor

I have been keeping an updated gist of R code that combines details from the answers to these questions (example output included in the gist). That gist is here: https://gist.github.com/benmarwick/9a54cbd325149a8ff405

Ben asked Mar 08 '16 11:03


1 Answer

The faithfuld dataset already has a density column, which holds the 2D density estimates for waiting and eruptions. The waiting and eruptions values in that dataset are points on a regular grid. geom_raster does not compute the density for you; it plots the density at the x, y coordinates it is given, which here is that grid. So if you just apply a log transformation to y, the spacing between the y values (originally equal) becomes uneven, and that is why you see gaps in your plot. I used points to visualize the effect:

library(ggplot2)
library(gridExtra)

# Use point to visualize the effect of log on the dataset
g1 <- ggplot(faithfuld, aes(x=waiting, y=eruptions)) +
  geom_point(size=0.5)    

g2 <- ggplot(faithfuld, aes(x=waiting, y=log(eruptions))) +
  geom_point(size=0.5)    

grid.arrange(g1, g2, ncol=2)    
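
The same distortion can also be checked numerically. This is a minimal sketch, assuming ggplot2 is installed (it ships the faithfuld dataset); y_grid, steps_raw and steps_log are illustrative names that do not appear in the original answer.

```r
library(ggplot2)  # provides the faithfuld dataset

# Unique grid positions along the eruptions axis
y_grid <- sort(unique(faithfuld$eruptions))

# Step size between neighbouring grid rows on the original scale
steps_raw <- diff(y_grid)
range(steps_raw)   # min and max should agree up to floating-point error

# Step size after a log transform: it shrinks as eruptions grows,
# so equal-sized raster tiles can no longer cover the grid without gaps
steps_log <- diff(log(y_grid))
range(steps_log)
```

If the two values from the first range() call match and the two from the second do not, the grid is regular before the transform and irregular after it.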


If you really want to transform y to a log scale and still produce the density plot, you have to use the faithful dataset (the raw observations) with geom_density_2d, so that the density is estimated after the transformation.

# Use geom_density_2d
# after_stat(density) replaces the deprecated ..density.. notation (ggplot2 >= 3.3.0)
ggplot(faithful, aes(x=waiting, y=log(eruptions))) +
  geom_density_2d() +
  stat_density_2d(geom="raster", aes(fill=after_stat(density)),
                  contour=FALSE)


Update: use geom_rect and supply custom xmin, xmax, ymin, ymax values that fit the spacing of the log scale.

Since geom_raster draws every tile at the same size, you probably have to use geom_tile or geom_rect to create the plot. My idea is to calculate how wide each tile should be and adjust xmin and xmax for each tile to fill the gaps.

 dat <- data.frame(x = rep(1:10, 10), 
                  y = unlist(lapply(1:10, function(i) rep(i, 10))), 
                  z = faithfuld$density[1:100])
library(ggplot2)
library(gridExtra)   

g <- ggplot(dat, aes(x = log(x), y = y, fill = z)) +
  geom_raster()   

# Tile centres on the log scale, one per unique x value
x_vals <- sort(unique(dat$x))
log_x <- log(x_vals)

# Half the gap between neighbouring centres; edge tiles reuse the nearest gap
half_gap <- diff(log_x)/2
lower <- log_x - c(half_gap[1], half_gap)
upper <- log_x + c(half_gap, half_gap[length(half_gap)])

# Create xmin, xmax, ymin, ymax
idx <- match(dat$x, x_vals)
dat$xmin <- lower[idx] # tile edges sit halfway between log(x) centres
dat$xmax <- upper[idx]
dat$ymin <- dat$y - 0.5 # y is evenly spaced, so each tile is 1 unit tall
dat$ymax <- dat$y + 0.5

# You can also use geom_tile with the width argument
g2 <- ggplot(dat, aes(xmin=xmin, xmax=xmax, ymin=ymin, ymax=ymax, fill=z)) +
  geom_rect() +
  labs(x="log(x)", y="y")

# show the plots     
grid.arrange(g, g2, ncol=2)
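
The geom_tile variant mentioned in the comment could look roughly like this. It is only a sketch under the same assumptions as the example data: tile edges are placed halfway between neighbouring log(x) values, and x_mid, x_width and p_tile are hypothetical names introduced for illustration. Because geom_tile centres each tile on x, the centre is taken as the midpoint of the two edges rather than log(x) itself.

```r
library(ggplot2)

# Same toy data as in the question
dat <- data.frame(x = rep(1:10, 10),
                  y = rep(1:10, each = 10),
                  z = faithfuld$density[1:100])

# Tile edges halfway between neighbouring log(x) values
x_vals <- sort(unique(dat$x))
log_x <- log(x_vals)
half_gap <- diff(log_x) / 2
lower <- log_x - c(half_gap[1], half_gap)
upper <- log_x + c(half_gap, half_gap[length(half_gap)])

# geom_tile needs a centre and a width instead of two edges
idx <- match(dat$x, x_vals)
dat$x_mid <- (lower[idx] + upper[idx]) / 2
dat$x_width <- upper[idx] - lower[idx]

p_tile <- ggplot(dat, aes(x = x_mid, y = y, width = x_width, fill = z)) +
  geom_tile(height = 1) +
  labs(x = "log(x)")
print(p_tile)
```

Either geom should give a gap-free picture; geom_rect is arguably the more direct choice here because the edges are what get computed in the first place.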


JasonWang answered Oct 26 '22 19:10