How to fill in stat_summary_2d ggplot

Tags:

r

ggplot2

I have a very basic ggplot question. I am wondering if there is a way I could have ggplot automatically fill in all the empty squares in this plot with squares representing zeros.

enter image description here

Here is an example of the code I tried:

library(ggplot2)

Matrix <- as.data.frame(cbind(rnorm(10), rnorm(10), rnorm(10)))
d <- ggplot(Matrix, aes(x = V1, y = V2, z = V3))
d + stat_summary_2d(bins = 10)

I am also aware of the geom_raster() function, but that gave me:

enter image description here

instead of the smooth surface I was expecting from the ggplot docs, like this one:

enter image description here

The code for the raster was:

ggplot(Matrix, aes(V1, V2)) +
  geom_raster(aes(fill = V3))
Anonymous Emu Avatar asked Jan 27 '26 05:01



1 Answer

For the nice "smooth surfaces" that you see in the docs, ggplot2 expects you to give it a set of plotted points, and it calculates their density using the MASS::kde2d() kernel density function. What I think you want to do instead is take a dataset where the value at each point is known and smooth out the colors of the spaces in between.

I have never found a way to do this using the ggplot2 functions on their own. Because you are assuming that there is a spatial relationship between the points (the closer, the more intense; the further, the less intense), this problem will likely be best solved using a package that focuses on spatial interpolation, such as inverse distance weighting (IDW). This chapter explains how to do it and how it works: https://mgimond.github.io/Spatial/interpolation-in-r.html
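To make the IDW idea concrete, here is a minimal sketch in base R, with ggplot2 used only for drawing. It is not the method from the linked chapter, which relies on dedicated spatial packages. The names pts, grid and idw_value, the 50 x 50 grid and the exponent power = 2 are all assumptions chosen for illustration.

```r
library(ggplot2)

set.seed(1)
# Hypothetical data: 10 points with a known z value at each location
pts <- data.frame(x = rnorm(10), y = rnorm(10), z = rnorm(10))

# Regular grid covering the range of the data (50 x 50 is an arbitrary choice)
grid <- expand.grid(
  x = seq(min(pts$x), max(pts$x), length.out = 50),
  y = seq(min(pts$y), max(pts$y), length.out = 50)
)

# Inverse distance weighting: each grid cell gets a weighted mean of all
# observed z values, with weights 1 / distance^power
idw_value <- function(gx, gy, pts, power = 2) {
  d <- sqrt((pts$x - gx)^2 + (pts$y - gy)^2)
  # A grid cell that coincides with a data point takes that point's value
  if (any(d == 0)) return(pts$z[which(d == 0)[1]])
  w <- 1 / d^power
  sum(w * pts$z) / sum(w)
}

grid$z <- mapply(idw_value, grid$x, grid$y, MoreArgs = list(pts = pts))

# The grid is evenly spaced, so geom_raster() draws a filled surface
p <- ggplot(grid, aes(x, y)) +
  geom_raster(aes(fill = z)) +
  geom_point(data = pts)
p
```

Because every interpolated value is a weighted average of the observed values, the surface never goes above the largest or below the smallest z in the data. A larger power makes each point's influence more local.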

I believe the reason geom_raster() is producing small points is that the points are not at evenly spaced intervals on the x and y axes. If I understand the code correctly, the function looks for the smallest gap between values on each axis and creates pixel tile sizes to match. Yours are very uneven, so the tiles are small. stat_summary_2d(bins = 5, fun = "sum") will get you closer, but it will not interpolate the space in between.
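If the goal is only to have the empty squares drawn as zeros, as the question asks, one option is to do the binning by hand and plot the completed grid. The sketch below is one possible approach rather than a built-in ggplot2 feature. The names pts, n_bins, binned and z_sum are hypothetical, and summing z per bin is an assumption that mirrors fun = "sum".

```r
library(ggplot2)

set.seed(1)
# Hypothetical data with the same shape as the question's example
pts <- data.frame(x = rnorm(10), y = rnorm(10), z = rnorm(10))

n_bins <- 5
# Assign every point to an x bin and a y bin; the factors keep empty levels
pts$xbin <- cut(pts$x, breaks = n_bins)
pts$ybin <- cut(pts$y, breaks = n_bins)

# xtabs() sums z per bin pair and returns 0 for pairs that contain no points
binned <- as.data.frame(
  xtabs(z ~ xbin + ybin, data = pts),
  responseName = "z_sum"
)

# One tile per bin pair, including the empty ones
p <- ggplot(binned, aes(xbin, ybin, fill = z_sum)) +
  geom_tile()
p
```

The axes here are discrete and labelled with the interval of each bin, so the plot reads as a grid of categories rather than a continuous surface.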

As a workaround, if your dataset is not huge and your z values can be made into integers, you can repeat each point z times using tidyr::uncount(), so that the density estimate reflects z. I wouldn't use this for a very large dataset, but in a pinch it will do what you want.

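library(dplyr)
library(tidyr)
library(ggplot2)

# Repeat each point z times so that the density reflects the z value
df <-
  tibble(
    x = rnorm(10),
    y = rnorm(10),
    z = floor(rnorm(10) + 10)
  ) %>%
  uncount(z)

# after_stat() replaces the deprecated stat() helper
ggplot(df, aes(x, y)) +
  stat_density_2d(
    aes(fill = after_stat(density)),
    geom = "raster", contour = FALSE
  ) +
  geom_jitter(width = 0.1, height = 0.1)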

enter image description here

yake84 Avatar answered Jan 28 '26 17:01
