I have multiple trajectories saved in simple feature (sf
) of the type POINT
. I'd like to calculate the Euclidean distances between subsequent locations (i.e. rows). Until now, I've "manually" calculated distances using the Pythagorean formula for calculating Euclidean Distances in 2D space. I was wondering if I could do the same using the function sf::st_distance()
. Here's a quick example:
library(sf)
library(dplyr)
set.seed(1)
df <- data.frame(
gr = c(rep("a",5),rep("b",5)),
x = rnorm(10),
y = rnorm(10)
)
df <- st_as_sf(df,coords = c("x","y"),remove = F)
df %>%
group_by(gr) %>%
mutate(
dist = sqrt((lead(x)-x)^2+(lead(y)-y)^2)
)
#> Simple feature collection with 10 features and 4 fields
#> geometry type: POINT
#> dimension: XY
#> bbox: xmin: -0.8356286 ymin: -2.2147 xmax: 1.595281 ymax: 1.511781
#> epsg (SRID): NA
#> proj4string: NA
#> # A tibble: 10 x 5
#> # Groups: gr [2]
#> gr x y dist geometry
#> <fct> <dbl> <dbl> <dbl> <POINT>
#> 1 a -0.626 1.51 1.38 (-0.6264538 1.511781)
#> 2 a 0.184 0.390 1.44 (0.1836433 0.3898432)
#> 3 a -0.836 -0.621 2.91 (-0.8356286 -0.6212406)
#> 4 a 1.60 -2.21 3.57 (1.595281 -2.2147)
#> 5 a 0.330 1.12 NA (0.3295078 1.124931)
#> 6 b -0.820 -0.0449 1.31 (-0.8204684 -0.04493361)
#> 7 b 0.487 -0.0162 0.992 (0.4874291 -0.01619026)
#> 8 b 0.738 0.944 0.204 (0.7383247 0.9438362)
#> 9 b 0.576 0.821 0.910 (0.5757814 0.8212212)
#> 10 b -0.305 0.594 NA (-0.3053884 0.5939013)
I would like to calculate dist
with sf::st_distance()
. How would I go about this?
The first thing to know about sf
is that the geometry column (the one of class sfc
) is stored as a list-column inside the dataframe. The key to usually do anything with a list-column is to either use purrr::map
and friends or to use a function that accepts list-cols as arguments. In the case of st_distance
its arguments can be an object of sf
(a dataframe), sfc
(the geometry column), or even an sfg
(a single geom row), so there's no need for map
and friends. The solution should look something like this:
df %>%
group_by(gr) %>%
mutate(
dist = st_distance(geometry)
)
However, this doesn't work. After some investigating, we find two problems. First, st_distance
returns a distance matrix and not a single value. To solve this, we make use of the by_element = T
argument of st_distance
.
Next, we can't just do dist = st_distance(geometry, lead(geometry), by_element = T)
because lead
only works on vector columns, not list columns.
To solve this second problem, we create the lead column ourselves using geometry[row_number() + 1]
.
Here's the full solution:
library(sf)
library(dplyr)
df %>%
group_by(gr) %>%
mutate(
lead = geometry[row_number() + 1],
dist = st_distance(geometry, lead, by_element = T),
)
#> Simple feature collection with 10 features and 4 fields
#> Active geometry column: geometry
#> geometry type: POINT
#> dimension: XY
#> bbox: xmin: -0.8356286 ymin: -2.2147 xmax: 1.595281 ymax: 1.511781
#> epsg (SRID): 4326
#> proj4string: +proj=longlat +datum=WGS84 +no_defs
#> # A tibble: 10 x 6
#> # Groups: gr [2]
#> gr x y dist geometry
#> <fct> <dbl> <dbl> <dbl> <sf_geometry [degree]>
#> 1 a -0.626 1.51 1.38 POINT (-0.6264538 1.511781)
#> 2 a 0.184 0.390 1.44 POINT (0.1836433 0.3898432)
#> 3 a -0.836 -0.621 2.91 POINT (-0.8356286 -0.6212406)
#> 4 a 1.60 -2.21 3.57 POINT (1.595281 -2.2147)
#> 5 a 0.330 1.12 0 POINT (0.3295078 1.124931)
#> 6 b -0.820 -0.0449 1.31 POINT (-0.8204684 -0.04493361)
#> 7 b 0.487 -0.0162 0.992 POINT (0.4874291 -0.01619026)
#> 8 b 0.738 0.944 0.204 POINT (0.7383247 0.9438362)
#> 9 b 0.576 0.821 0.910 POINT (0.5757814 0.8212212)
#> 10 b -0.305 0.594 0 POINT (-0.3053884 0.5939013)
#> # ... with 1 more variable: lead <sf_geometry [degree]>
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With