
Best method of spatial interpolation for geographic heat/contour maps?

I'd like to use something like ggplot2 and ggmap to produce a heat map of arbitrary values such as property prices per metre squared over a geographic area at a street level (with a high resolution).

Unfortunately, the task appears to be rather difficult because while ggplot2 can produce a great density plot, it seems unable to visualise spatial data like this without prior interpolation.

For this, I've used the libraries akima (gridded bivariate interpolation for irregular data) and mgcv (generalised additive models with integrated smoothness estimation). However, my knowledge of interpolation methods is mediocre at best, and the results I've been able to produce aren't satisfactory.

Consider the following example:

Data

library(ggplot2)
library(ggmap)
library(tibble)  # provides tibble(), used below

## data simulation
set.seed(1945)

df <- tibble(x = rnorm(500, -0.7406, 0.03),
             y = rnorm(500, 51.9976, 0.03),
             z = abs(rnorm(500, 2000, 1000)))

Map, scatterplot, density plot

## ggmap
map <- get_map("Bletchley Park, Bletchley, Milton Keynes", zoom = 13, source = "stamen", maptype = "toner-background")
q <- ggmap(map, extent = "device", darken = .5)

## scatterplot over map
q + geom_point(aes(x, y, colour = z), data = df)

## classic density heat map
q + 
  stat_density2d(aes(x=x, y=y, fill=..level..), data=df, geom="polygon", alpha = .2) + 
  geom_density_2d(aes(x=x, y=y), data=df, colour = "white", alpha = .4) +
  scale_fill_distiller(palette = "Spectral")

As you can see, the data are rather dense over the chosen area and the density heat map looks great with round edges and closed curves (except for some of the outermost layers).

[Image: density plot]

Interpolation and plotting using akima

## akima interpolation
library(akima)

df_akima <- interp2xyz(interp(x = df$x, y = df$y, z = df$z, duplicate = "mean", linear = TRUE,
                              xo = seq(min(df$x), max(df$x), length.out = 200),
                              yo = seq(min(df$y), max(df$y), length.out = 200)), data.frame = TRUE)

## akima plot
q +
  geom_tile(aes(x = x, y = y, fill = z), data = df_akima, alpha = .4) +
  stat_contour(aes(x = x, y = y, z = z, fill = ..level..), data = df_akima, geom = 'polygon', alpha = .4) +
  geom_contour(aes(x = x, y = y, z = z), data = df_akima, colour = 'white', alpha = .4) +
  scale_fill_distiller(palette = "Spectral", na.value = NA)

This produces a dense grid of interpolated values (to ensure a sufficient resolution) and while the tile plot underneath is acceptable, the contour plots are too ragged and many of the curves aren't closed.

[Image: linear akima]

Non-linear interpolation (linear = FALSE) is smoother, but it apparently sacrifices resolution and produces values outside the range of the data (including negative values of z).

[Image: non-linear akima]
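A one-dimensional analogue may help explain the negative values. The sketch below is not akima's algorithm. It uses base R's approx() and spline() on made-up data to show that linear interpolation stays within the range of the observations, while a smooth cubic spline can overshoot next to a sharp peak.

```r
# Hypothetical 1-D data: all values are non-negative, with one sharp peak.
x <- 1:7
z <- c(0, 0, 0, 10, 0, 0, 0)

# Piecewise-linear interpolation never leaves the range of the data.
lin <- approx(x, z, n = 61)

# A natural cubic spline is smoother but can overshoot next to the peak.
cub <- spline(x, z, n = 61, method = "natural")

range(lin$y)  # stays within [0, 10]
min(cub$y)    # dips below zero on either side of the peak
```

The same trade-off between smoothness and staying within the data range is plausibly what shows up in two dimensions when switching from linear = TRUE to linear = FALSE.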

Interpolation and plotting using mgcv

## mgcv interpolation
library(mgcv)

fit <- gam(z ~ s(x, y, bs = 'sos'), data = df)  # named fit so it doesn't mask mgcv::gam()
df_mgcv <- expand.grid(x = seq(min(df$x), max(df$x), length.out = 200),
                       y = seq(min(df$y), max(df$y), length.out = 200))
df_mgcv$z <- predict(fit, df_mgcv, type = "response")

## mgcv plot
q +
  geom_tile(aes(x = x, y = y, fill = z), data = df_mgcv, alpha = .4) +
  stat_contour(aes(x = x, y = y, z = z, fill = ..level..), data = df_mgcv, geom = 'polygon', alpha = .4) +
  geom_contour(aes(x = x, y = y, z = z), data = df_mgcv, colour = 'white', alpha = .4) +
  scale_fill_distiller(palette = "Spectral", na.value = NA)

The same process using mgcv results in a nice, smooth plot, but the resolution is much lower and practically none of the curves are closed.

[Image: mgcv]

Questions

  1. Could you please suggest a better method or modify my attempt to obtain a plot similar to the first one (clean, connected, and smooth lines with high resolution)?

  2. Is it possible to close the curves, e.g. in the last plot (the shaded area should be computed beyond the image boundaries)?

Thank you for your time!

Harold Cavendish asked May 25 '18 16:05



1 Answer

The problem with your maps is not the interpolation method you're using, but the way ggplot draws density and contour polygons. An existing answer covers this: "Remove gaps in a stat_density2d ggplot chart without modifying XY limits".

The density lines go beyond the map, so any polygon that extends outside the plot area is rendered incorrectly (ggplot closes the polygon using the next point of the corresponding level). This does not show up much on your first map because the interpolation resolution is low.

The trick proposed by Andrew is to first expand the plot area, so that the density lines are rendered correctly, and then crop the display area to hide the extra space. I tested his solution with your first example; here's the code:

q + 
  stat_density2d(
    aes(x = x, y = y, fill = ..level..),
    data = df,
    geom = "polygon",
    alpha = .2,
    color = "white",
    bins = 20
  ) + 
  scale_fill_distiller(
    palette = "Spectral"
  ) +
  xlim(
    min(df$x) - 10^-5,
    max(df$x) + 10^-5
  ) +
  ylim(
    min(df$y) - 10^-3,
    max(df$y) + 10^-3
  ) +
  coord_equal(
    expand = FALSE,
    xlim = c(-.778, -.688),
    ylim = c(51.965, 52.03)
  )

The only differences are that I used min() - / max() + instead of fixed numbers, and coord_equal to ensure the map wasn't distorted. In addition, I manually specified a greater number of levels (using bins), since stat_density2d automatically chooses a lower resolution when the plot area is increased.
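To see why padding the area helps, here is a small check that does not depend on ggplot2 or on the map. It is only a sketch on a toy surface: bump(), lines_at() and touches_border() are hypothetical helpers, and contourLines() from grDevices stands in for whatever ggplot computes internally. A level line that is cut off by a tight grid ends on the grid border, while the same level on a padded grid closes on itself.

```r
# Toy surface: a single smooth bump centred on the origin (hypothetical data).
bump <- function(x, y) exp(-(x^2 + y^2))

# Contour lines of `bump` at `level`, computed on the square [-lim, lim]^2.
lines_at <- function(level, lim, n = 101) {
  g <- seq(-lim, lim, length.out = n)
  contourLines(g, g, outer(g, g, bump), levels = level)
}

# TRUE if any vertex of a contour line lies on the border of the grid,
# i.e. the line was cut off there instead of closing on itself.
touches_border <- function(cl, lim, tol = 1e-8) {
  any(abs(abs(cl$x) - lim) < tol | abs(abs(cl$y) - lim) < tol)
}

# The 0.3 level is a circle of radius ~1.10, so a grid on [-1, 1]^2 cuts it.
tight <- lines_at(0.3, lim = 1)
# Padding the grid to [-1.5, 1.5]^2 leaves room for the whole circle.
padded <- lines_at(0.3, lim = 1.5)

cut_tight  <- vapply(tight,  touches_border, logical(1), lim = 1)
cut_padded <- vapply(padded, touches_border, logical(1), lim = 1.5)

all(cut_tight)    # every piece on the tight grid ends on the border
any(cut_padded)   # no piece on the padded grid touches the border
```

Once the lines are closed on the padded area, cropping the view (as coord_equal does with xlim and ylim above) hides the padding without reopening them.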

As for the best interpolation method, this depends on your objective and the type of data you have. The question is not what is the best method for your map, but what is the best method for your data. This is a very broad issue, out of scope for this space. But here's a good guide: http://www.rspatial.org/analysis/rst/4-interpolation.html
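One way to compare methods on your own data is to check how well each one predicts points it has not seen. The sketch below is an illustration only, not a recommendation of a method: it uses base R, a simple inverse distance weighting predictor, and made-up data with real spatial structure. idw_predict(), loo_rmse() and the power values are assumptions chosen for the example.

```r
# Inverse distance weighting: predict z at (x0, y0) from known points.
# `power` controls how fast a point's influence decays with distance.
idw_predict <- function(x0, y0, x, y, z, power = 2) {
  d <- sqrt((x - x0)^2 + (y - y0)^2)
  if (any(d == 0)) return(mean(z[d == 0]))  # exact hit: return the observed value
  w <- 1 / d^power
  sum(w * z) / sum(w)
}

# Leave-one-out RMSE: predict each point from all the others.
loo_rmse <- function(x, y, z, power) {
  pred <- vapply(seq_along(z), function(i) {
    idw_predict(x[i], y[i], x[-i], y[-i], z[-i], power)
  }, numeric(1))
  sqrt(mean((pred - z)^2))
}

set.seed(1)
# Hypothetical data in which z depends on location (unlike the simulated
# z in the question, which is drawn independently of x and y).
x <- runif(200)
y <- runif(200)
z <- 2000 + 1000 * sin(3 * x) * cos(3 * y) + rnorm(200, sd = 50)

powers <- c(1, 2, 4)
scores <- vapply(powers, function(p) loo_rmse(x, y, z, p), numeric(1))
names(scores) <- powers
scores  # lower is better for this data set
```

The same loop works for any predictor that takes training points and a target location, so it can be reused to compare akima, mgcv, or kriging fits on equal terms.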

For general ideas on how to make good maps in R using ggplot: http://spatial.ly/r/

Carlos Eduardo Lagosta answered Oct 21 '22 13:10