The new-ish sf package for R makes it really easy to deal with geographic data in R, and the development version of ggplot2 has a new geom_sf() layer for plotting sf-style geographic data.
Within the sf paradigm of working with data, is it possible to map ggplot aesthetics to LINESTRING geometries?
For instance, with standard ggplot, it's possible to recreate Minard's famous plot of survivors from Napoleon's Grande Armée in 1812 using this data, sizing the path of the army by the number of survivors:
# Install the dev version of ggplot2 for geom_sf()
# devtools::install_github("tidyverse/ggplot2")
library(tidyverse)
troops <- read_csv("https://gist.githubusercontent.com/andrewheiss/69b9dffb7cca392eb7f9bdf56789140f/raw/3e2a48635ae44837955765b5e7747c429b0b5d71/troops.csv")
ggplot(troops) +
  geom_path(aes(x = long, y = lat, color = direction,
                group = group, size = survivors),
            lineend = "round")
We can work with this troops data as an sf object by creating a new geometry column, like so:
library(sf)
#> Linking to GEOS 3.6.1, GDAL 2.1.3, proj.4 4.9.3
troops_with_geometry <- troops %>%
  st_as_sf(coords = c("long", "lat"))
head(troops_with_geometry)
#> Simple feature collection with 6 features and 3 fields
#> geometry type: POINT
#> dimension: XY
#> bbox: xmin: 24 ymin: 54.5 xmax: 28 ymax: 55
#> epsg (SRID): NA
#> proj4string: NA
#> # A tibble: 6 x 4
#> survivors direction group geometry
#> <int> <chr> <int> <simple_feature>
#> 1 340000 A 1 <POINT (24 54.9)>
#> 2 340000 A 1 <POINT (24.5 55)>
#> 3 340000 A 1 <POINT (25.5 ...>
#> 4 320000 A 1 <POINT (26 54.7)>
#> 5 300000 A 1 <POINT (27 54.8)>
#> 6 280000 A 1 <POINT (28 54.9)>
If we plot this with geom_sf(), ggplot will use points:
ggplot(troops_with_geometry) +
  geom_sf(aes(color = direction, group = group))
We can create line strings for each of the groups and directions by grouping, summarizing, and casting.
troops_lines <- troops_with_geometry %>%
  group_by(direction, group) %>%
  summarize() %>%
  st_cast("LINESTRING")
head(troops_lines)
#> Simple feature collection with 6 features and 2 fields
#> geometry type: LINESTRING
#> dimension: XY
#> bbox: xmin: 24 ymin: 54.1 xmax: 37.7 ymax: 55.8
#> epsg (SRID): NA
#> proj4string: NA
#> direction group geometry
#> 1 A 1 LINESTRING (24 54.9, 24.5 5...
#> 2 A 2 LINESTRING (24 55.1, 24.5 5...
#> 3 A 3 LINESTRING (24 55.2, 24.5 5...
#> 4 R 1 LINESTRING (24.1 54.4, 24.2...
#> 5 R 2 LINESTRING (28.3 54.2, 28.5...
#> 6 R 3 LINESTRING (24.1 54.4, 24.2...
ggplot can then plot these six connected lines and color them correctly:
ggplot(troops_lines) +
  geom_sf(aes(color = direction, group = group))
However, the survivors data is now gone and there's no way to map size aesthetics to the new lines.
Is there a way to associate other aesthetics (like size) with sf-based LINESTRING data? Or, in other words, is there a way to recreate ggplot(...) + geom_path(aes(x = long, y = lat, size = something)) using geom_sf() and the sf paradigm of working with geographic data?
You need to create a linestring from each pair of points, within each group. The result is not as pretty because I don't know how to give the lines round endpoints.
# Within each group, repeat each point twice,
# then drop the first and last rows and
# add a variable called linegroup, which pairs the start and end points of each line
troops <- troops %>%
  group_by(group) %>%
  slice(rep(1:n(), each = 2)) %>%
  slice(-c(1, n())) %>%
  mutate(linegroup = rep(seq_len(n() / 2), each = 2)) %>%
  ungroup()
# create linestring sf object by summarizing the points,
# grab the last survivor and direction value of each group (i.e. the 'endpoint' value)
troops_line <- st_as_sf(troops, coords = c("long", "lat"), crs = 4326) %>%
  group_by(group, linegroup) %>%
  summarise(survivors = last(survivors), direction = last(direction), do_union = FALSE) %>%
  st_cast("LINESTRING")
gp <- ggplot(troops_line) +
  geom_sf(aes(color = direction, size = survivors), show.legend = "line")
# Print the plot object so it is actually drawn
gp
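The pairing idea behind the duplicate-and-slice step above can also be expressed with dplyr's lead(), which looks up the next point within each group. This is only a minimal sketch on a made-up three-point tibble: the toy data and the column names long_end and lat_end are illustrative assumptions, not part of the troops data. It shows the segment logic only and does not build sf geometries.

```r
library(dplyr)

# Hypothetical toy data: a single group with three points along a path
toy <- tibble(
  group     = c(1, 1, 1),
  long      = c(24, 24.5, 25.5),
  lat       = c(54.9, 55, 54.5),
  survivors = c(340000, 340000, 320000)
)

# Pair each point with the next point in its group. The segment takes the
# survivors value of its endpoint, mirroring the last(survivors) choice above.
# The final point of each group has no successor, so that row is dropped.
segments <- toy %>%
  group_by(group) %>%
  mutate(
    long_end  = lead(long),
    lat_end   = lead(lat),
    survivors = lead(survivors)
  ) %>%
  filter(!is.na(long_end)) %>%
  ungroup()

segments
```

Three points yield two segments, one row per segment, each carrying its own survivors value. That per-segment attribute is what lets size vary along the path once each row is turned into its own LINESTRING.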