Calculate the Euclidean distance between points within grouped data

Question

In the data below (included with dput), I have repeat observations (lat and long) for three individuals (IndIDII). Note, there are a different number of locations for each individual.

> Dat
  IndIDII      IndYear  WintLat  WintLong
1 BHS_265 BHS_265-2015 47.61025 -112.7210
2 BHS_265 BHS_265-2016 47.59884 -112.7089
3 BHS_770 BHS_770-2016 42.97379 -109.0400
4 BHS_770 BHS_770-2017 42.97129 -109.0367
5 BHS_770 BHS_770-2018 42.97244 -109.0509
6 BHS_377 BHS_377-2015 43.34744 -109.4821
7 BHS_377 BHS_377-2016 43.35559 -109.4445
8 BHS_377 BHS_377-2017 43.35195 -109.4566
9 BHS_377 BHS_377-2018 43.34765 -109.4892

I would like to calculate the Euclidean distance between sequential points for each individual. My initial thought was to work within dplyr using lead() as shown below. The distm function requires a matrix, which I have been unable to create within dplyr. Is it possible to generate and use a matrix as the argument to distm?

Dat %>% 
  group_by(IndIDII) %>% 
  mutate(WitnGeoDist = distm(as.matrix(c("WintLong", "WintLat")), lead(as.matrix(c("WintLong", "WintLat"))), fun = distVincentyEllipsoid))

Alternatively, are there other possibilities...? Many thanks in advance.

Data:

Dat <- structure(list(IndIDII = c("BHS_265", "BHS_265", "BHS_770", "BHS_770", 
"BHS_770", "BHS_377", "BHS_377", "BHS_377", "BHS_377"), IndYear = c("BHS_265-2015", 
"BHS_265-2016", "BHS_770-2016", "BHS_770-2017", "BHS_770-2018", 
"BHS_377-2015", "BHS_377-2016", "BHS_377-2017", "BHS_377-2018"
), WintLat = c(47.6102519805014, 47.5988417247191, 42.9737859090909, 
42.9712914772727, 42.9724390816327, 43.3474354347826, 43.3555934579439, 
43.3519543396226, 43.3476466990291), WintLong = c(-112.720994832869, 
-112.708887595506, -109.039964727273, -109.036693522727, -109.050923061224, 
-109.482114456522, -109.444522149533, -109.45659254717, -109.489241553398
)), class = "data.frame", row.names = c(NA, -9L))

Calum You · Accepted Answer

Here is a different method that leverages group_by better and gets geosphere::distm working by using purrr::possibly. This lets us fill in NA for the rows where the distance doesn't make sense, because there are no previous values to work from.

Dat <- structure(list(IndIDII = c("BHS_265", "BHS_265", "BHS_770", "BHS_770", "BHS_770", "BHS_377", "BHS_377", "BHS_377", "BHS_377"), IndYear = c("BHS_265-2015", "BHS_265-2016", "BHS_770-2016", "BHS_770-2017", "BHS_770-2018", "BHS_377-2015", "BHS_377-2016", "BHS_377-2017", "BHS_377-2018"), WintLat = c(47.6102519805014, 47.5988417247191, 42.9737859090909, 42.9712914772727, 42.9724390816327, 43.3474354347826, 43.3555934579439, 43.3519543396226, 43.3476466990291), WintLong = c(-112.720994832869, -112.708887595506, -109.039964727273, -109.036693522727, -109.050923061224, -109.482114456522, -109.444522149533, -109.45659254717, -109.489241553398)), class = "data.frame", row.names = c(NA, -9L))
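library(tidyverse)
# Wrap distm so a failed call (no previous point) returns NA instead of an error.
# distm runs with its default distance function here, not distVincentyEllipsoid
# from the question; geosphere distances are in metres.
poss_dist <- possibly(geosphere::distm, otherwise = NA)
Dat %>%
  # One-row data frame of coordinates per observation, longitude first.
  # In tidyr >= 1.0.0 this syntax is deprecated; use nest(coords = c(WintLong, WintLat)).
  nest(WintLong, WintLat, .key = "coords") %>%
  group_by(IndIDII) %>%
  # lag() runs per individual, so each group's first row has no previous point
  mutate(prev_coords = lag(coords)) %>%
  ungroup() %>%
  mutate(WitnGeoDist = map2_dbl(coords, prev_coords, poss_dist))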
#> # A tibble: 9 x 5
#>   IndIDII IndYear      coords              prev_coords         WitnGeoDist
#>   <chr>   <chr>        <list>              <list>                    <dbl>
#> 1 BHS_265 BHS_265-2015 <data.frame [1 x 2~ <lgl [1]>                   NA 
#> 2 BHS_265 BHS_265-2016 <data.frame [1 x 2~ <data.frame [1 x 2~       1561.
#> 3 BHS_770 BHS_770-2016 <data.frame [1 x 2~ <lgl [1]>                   NA 
#> 4 BHS_770 BHS_770-2017 <data.frame [1 x 2~ <data.frame [1 x 2~        385.
#> 5 BHS_770 BHS_770-2018 <data.frame [1 x 2~ <data.frame [1 x 2~       1168.
#> 6 BHS_377 BHS_377-2015 <data.frame [1 x 2~ <lgl [1]>                   NA 
#> 7 BHS_377 BHS_377-2016 <data.frame [1 x 2~ <data.frame [1 x 2~       3180.
#> 8 BHS_377 BHS_377-2017 <data.frame [1 x 2~ <data.frame [1 x 2~       1059.
#> 9 BHS_377 BHS_377-2018 <data.frame [1 x 2~ <data.frame [1 x 2~       2690.

Created on 2018-09-19 by the reprex package (v0.2.0).
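For comparison, here is a minimal sketch of an alternative that is not part of the answer above. It skips the list columns and instead passes two-column matrices built with cbind() to one of geosphere's vectorised pairwise functions inside a grouped mutate(). It assumes distHaversine is an acceptable distance measure. That function uses a spherical model, so its values will differ slightly from the ellipsoidal distances printed above. The data frame is rebuilt from the rounded values in the question's printed table, and the name res is only illustrative.

```r
library(dplyr)
library(geosphere)

# Same nine observations as in the question, rounded as in the printed table.
Dat <- data.frame(
  IndIDII  = rep(c("BHS_265", "BHS_770", "BHS_377"), times = c(2, 3, 4)),
  WintLat  = c(47.61025, 47.59884, 42.97379, 42.97129, 42.97244,
               43.34744, 43.35559, 43.35195, 43.34765),
  WintLong = c(-112.7210, -112.7089, -109.0400, -109.0367, -109.0509,
               -109.4821, -109.4445, -109.4566, -109.4892)
)

res <- Dat %>%
  group_by(IndIDII) %>%
  mutate(
    # Row-wise distance (metres) from each point to the previous point of the
    # same individual; geosphere expects longitude first, then latitude.
    # lag() yields NA for the first row of each group, so the distance is NA there.
    WitnGeoDist = distHaversine(
      cbind(WintLong, WintLat),
      cbind(lag(WintLong), lag(WintLat))
    )
  ) %>%
  ungroup()

res
```

Unlike distm, which builds a full distance matrix between two sets of points, the pairwise functions return one value per row, which is the shape mutate() needs. Using lead() instead of lag() would give the distance to the next point rather than the previous one.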

Tags: r, dplyr, distance, geosphere, r-sp

B. Davis
