geosphere distHaversine() & dplyr - error wrong length for vector, should be 2

Tags: r, dplyr, geosphere

I cannot resolve the error "wrong length for vector, should be 2" when calculating the distance between two points, here the runway length between the two runway thresholds (ends). I have read answers such as "R error: Wrong length for a vector, should be 2", but I do not understand them well enough to apply them to my case. A simplified data frame of runway end positions looks like this:

runways <-  data.frame(
 RWY_ID = c(1,2,3)
,RWY    = c("36R", "36L","01")
,LAT    = c(40.08, 40.12, 40.06)
,LON    = c(116.59, 116.57, 116.62)
,LAT2   = c(40.05, 40.07,40.09)
,LON2   = c(116.6, 116.57, 116.61)
)

Using the distHaversine() function from geosphere, I try to calculate the distance:

library(dplyr)
library(geosphere)

# This call fails with "wrong length for vector, should be 2"
runways <- mutate(runways,
                  CTD = distHaversine(c(LON, LAT), c(LON2, LAT2)))

I am not sure what I am doing wrong here. If I pull out the LON/LAT position of a single row, it is a numeric vector of the right length:

myv <- c(runways$LON[1], runways$LAT[1])
myv

[1] 116.59  40.08
str(myv)
num [1:2] 116.6 40.1
Rainer asked Nov 11 '16 19:11




1 Answer

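Inside mutate(), LON and LAT refer to whole columns, so c(LON, LAT) is a vector of length 6 rather than a single longitude/latitude pair. You need to operate rowwise, so that distHaversine is passed one pair of points at a time instead of all the rows: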

runways %>% rowwise() %>% 
    mutate(CTD = distHaversine(c(LON, LAT), c(LON2, LAT2)))

## Source: local data frame [3 x 7]
## Groups: <by row>
## 
## # A tibble: 3 × 7
##   RWY_ID    RWY   LAT    LON  LAT2   LON2      CTD
##    <dbl> <fctr> <dbl>  <dbl> <dbl>  <dbl>    <dbl>
## 1      1    36R 40.08 116.59 40.05 116.60 3446.540
## 2      2    36L 40.12 116.57 40.07 116.57 5565.975
## 3      3     01 40.06 116.62 40.09 116.61 3446.509
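
To see why the original call fails, it may help to look at the lengths involved. The following is a minimal base R sketch using only the LAT and LON columns from the question; the names all_rows and one_row are made up for illustration:

```r
runways <- data.frame(
  LAT = c(40.08, 40.12, 40.06),
  LON = c(116.59, 116.57, 116.62)
)

# What mutate() sees without rowwise(): the full columns, concatenated.
all_rows <- with(runways, c(LON, LAT))
length(all_rows)  # 6: three longitudes followed by three latitudes

# What a single row contributes: one longitude/latitude pair.
one_row <- with(runways[1, ], c(LON, LAT))
length(one_row)   # 2
```

With rowwise(), each call gets the second kind of vector, which is the length-2 input the error message asks for.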

Alternatively, distHaversine accepts two-column matrices of points (longitude first, latitude second), so you can use cbind instead of c:

runways %>% mutate(CTD = distHaversine(cbind(LON, LAT), cbind(LON2, LAT2)))

##   RWY_ID RWY   LAT    LON  LAT2   LON2      CTD
## 1      1 36R 40.08 116.59 40.05 116.60 3446.540
## 2      2 36L 40.12 116.57 40.07 116.57 5565.975
## 3      3  01 40.06 116.62 40.09 116.61 3446.509

At scale, the latter approach is almost certainly better, as operating rowwise doesn't take advantage of vectorization and can therefore get slow.
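
To illustrate what vectorization means here, below is a hand-written haversine function in base R. It is only a sketch of the formula, not geosphere's own code; the name haversine_m is hypothetical, and the radius of 6378137 m is assumed to match the default of distHaversine. Every step is ordinary vector arithmetic, so a single call processes all rows at once:

```r
# Hand-written haversine distance in metres (a sketch, not geosphere's code).
# r = 6378137 is an assumption meant to mirror distHaversine()'s default radius.
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6378137) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin() guards against rounding pushing sqrt(a) slightly above 1.
  2 * r * asin(pmin(1, sqrt(a)))
}

runways <- data.frame(
  LAT  = c(40.08, 40.12, 40.06),
  LON  = c(116.59, 116.57, 116.62),
  LAT2 = c(40.05, 40.07, 40.09),
  LON2 = c(116.6, 116.57, 116.61)
)

# One call covers every row; no rowwise() or explicit loop is needed.
runways$CTD <- with(runways, haversine_m(LON, LAT, LON2, LAT2))
runways
```

The resulting CTD column should agree with the distHaversine values shown above to within rounding. In practice the cbind version is still the one to use; this sketch only shows why passing whole columns is cheap.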

alistaire answered Sep 20 '22 14:09