Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Distances of points between rows with sf

Tags:

r

dplyr

sf

I have multiple trajectories saved in simple feature (sf) of the type POINT. I'd like to calculate the Euclidean distances between subsequent locations (i.e. rows). Until now, I've "manually" calculated distances using the Pythagorean formula for calculating Euclidean Distances in 2D space. I was wondering if I could do the same using the function sf::st_distance(). Here's a quick example:

library(sf)
library(dplyr)

set.seed(1)

df <- data.frame(
  gr = c(rep("a",5),rep("b",5)),
  x  = rnorm(10),
  y = rnorm(10)
 )

df <- st_as_sf(df,coords = c("x","y"),remove = F)


df %>%
  group_by(gr) %>%
  mutate(
    dist = sqrt((lead(x)-x)^2+(lead(y)-y)^2)
  )
#> Simple feature collection with 10 features and 4 fields
#> geometry type:  POINT
#> dimension:      XY
#> bbox:           xmin: -0.8356286 ymin: -2.2147 xmax: 1.595281 ymax: 1.511781
#> epsg (SRID):    NA
#> proj4string:    NA
#> # A tibble: 10 x 5
#> # Groups:   gr [2]
#>    gr         x       y   dist                 geometry
#>    <fct>  <dbl>   <dbl>  <dbl>                  <POINT>
#>  1 a     -0.626  1.51    1.38     (-0.6264538 1.511781)
#>  2 a      0.184  0.390   1.44     (0.1836433 0.3898432)
#>  3 a     -0.836 -0.621   2.91   (-0.8356286 -0.6212406)
#>  4 a      1.60  -2.21    3.57        (1.595281 -2.2147)
#>  5 a      0.330  1.12   NA         (0.3295078 1.124931)
#>  6 b     -0.820 -0.0449  1.31  (-0.8204684 -0.04493361)
#>  7 b      0.487 -0.0162  0.992  (0.4874291 -0.01619026)
#>  8 b      0.738  0.944   0.204    (0.7383247 0.9438362)
#>  9 b      0.576  0.821   0.910    (0.5757814 0.8212212)
#> 10 b     -0.305  0.594  NA       (-0.3053884 0.5939013)

I would like to calculate dist with sf::st_distance(). How would I go about this?

like image 491
Ratnanil Avatar asked Apr 16 '18 09:04

Ratnanil


Video Answer


1 Answers

The first thing to know about sf is that the geometry column (the one of class sfc) is stored as a list-column inside the dataframe. The key to usually do anything with a list-column is to either use purrr::map and friends or to use a function that accepts list-cols as arguments. In the case of st_distance its arguments can be an object of sf (a dataframe), sfc (the geometry column), or even an sfg (a single geom row), so there's no need for map and friends. The solution should look something like this:

df %>%
  group_by(gr) %>%
  mutate(
    dist = st_distance(geometry)
  )

However, this doesn't work. After some investigating, we find two problems. First, st_distance returns a distance matrix and not a single value. To solve this, we make use of the by_element = T argument of st_distance.

Next, we can't just do dist = st_distance(geometry, lead(geometry), by_element = T) because lead only works on vector columns, not list columns.

To solve this second problem, we create the lead column ourselves using geometry[row_number() + 1].

Here's the full solution:

library(sf)
library(dplyr)

df %>%
  group_by(gr) %>%
  mutate(
    lead = geometry[row_number() + 1],
    dist = st_distance(geometry, lead, by_element = T),
  )
#> Simple feature collection with 10 features and 4 fields
#> Active geometry column: geometry
#> geometry type:  POINT
#> dimension:      XY
#> bbox:           xmin: -0.8356286 ymin: -2.2147 xmax: 1.595281 ymax: 1.511781
#> epsg (SRID):    4326
#> proj4string:    +proj=longlat +datum=WGS84 +no_defs
#> # A tibble: 10 x 6
#> # Groups:   gr [2]
#>    gr         x       y  dist                       geometry
#>    <fct>  <dbl>   <dbl> <dbl>         <sf_geometry [degree]>
#>  1 a     -0.626  1.51   1.38     POINT (-0.6264538 1.511781)
#>  2 a      0.184  0.390  1.44     POINT (0.1836433 0.3898432)
#>  3 a     -0.836 -0.621  2.91   POINT (-0.8356286 -0.6212406)
#>  4 a      1.60  -2.21   3.57        POINT (1.595281 -2.2147)
#>  5 a      0.330  1.12   0         POINT (0.3295078 1.124931)
#>  6 b     -0.820 -0.0449 1.31  POINT (-0.8204684 -0.04493361)
#>  7 b      0.487 -0.0162 0.992  POINT (0.4874291 -0.01619026)
#>  8 b      0.738  0.944  0.204    POINT (0.7383247 0.9438362)
#>  9 b      0.576  0.821  0.910    POINT (0.5757814 0.8212212)
#> 10 b     -0.305  0.594  0       POINT (-0.3053884 0.5939013)
#> # ... with 1 more variable: lead <sf_geometry [degree]>
like image 173
yeedle Avatar answered Sep 19 '22 05:09

yeedle