
Calculate total miles traveled from vectors of lat / lon

Tags: r, geolocation

I have a data frame with data about a driver and the route they followed. I'm trying to figure out the total mileage traveled. I'm using the geosphere package but can't figure out the correct way to apply it and get an answer in miles.

> head(df1)
  id       routeDateTime driverId      lat       lon
1  1 2012-11-12 02:08:41      123 76.57169 -110.8070
2  2 2012-11-12 02:09:41      123 76.44325 -110.7525
3  3 2012-11-12 02:10:41      123 76.90897 -110.8613
4  4 2012-11-12 03:18:41      123 76.11152 -110.2037
5  5 2012-11-12 03:19:41      123 76.29013 -110.3838
6  6 2012-11-12 03:20:41      123 76.15544 -110.4506

So far I've tried

spDists(cbind(df1$lon,df1$lat))

and several other functions but can't seem to get a reasonable answer.

Any suggestions?

> dput(df1)
structure(list(id = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 
13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 
29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40), routeDateTime = c("2012-11-12 02:08:41", 
"2012-11-12 02:09:41", "2012-11-12 02:10:41", "2012-11-12 03:18:41", 
"2012-11-12 03:19:41", "2012-11-12 03:20:41", "2012-11-12 03:21:41", 
"2012-11-12 12:08:41", "2012-11-12 12:09:41", "2012-11-12 12:10:41", 
"2012-11-12 02:08:41", "2012-11-12 02:09:41", "2012-11-12 02:10:41", 
"2012-11-12 03:18:41", "2012-11-12 03:19:41", "2012-11-12 03:20:41", 
"2012-11-12 03:21:41", "2012-11-12 12:08:41", "2012-11-12 12:09:41", 
"2012-11-12 12:10:41", "2012-11-12 02:08:41", "2012-11-12 02:09:41", 
"2012-11-12 02:10:41", "2012-11-12 03:18:41", "2012-11-12 03:19:41", 
"2012-11-12 03:20:41", "2012-11-12 03:21:41", "2012-11-12 12:08:41", 
"2012-11-12 12:09:41", "2012-11-12 12:10:41", "2012-11-12 02:08:41", 
"2012-11-12 02:09:41", "2012-11-12 02:10:41", "2012-11-12 03:18:41", 
"2012-11-12 03:19:41", "2012-11-12 03:20:41", "2012-11-12 03:21:41", 
"2012-11-12 12:08:41", "2012-11-12 12:09:41", "2012-11-12 12:10:41"
), driverId = c(123, 123, 123, 123, 123, 123, 123, 123, 123, 
123, 456, 456, 456, 456, 456, 456, 456, 456, 456, 456, 789, 789, 
789, 789, 789, 789, 789, 789, 789, 789, 246, 246, 246, 246, 246, 
246, 246, 246, 246, 246), lat = c(76.5716897079255, 76.4432530414779, 
76.9089707506355, 76.1115217276383, 76.2901271982118, 76.155437662499, 
76.4115052509587, 76.8397977722343, 76.3357809444424, 76.032417796785, 
76.5716897079255, 76.4432530414779, 76.9089707506355, 76.1115217276383, 
76.2901271982118, 76.155437662499, 76.4115052509587, 76.8397977722343, 
76.3357809444424, 76.032417796785, 76.5716897079255, 76.4432530414779, 
76.9089707506355, 76.1115217276383, 76.2901271982118, 76.155437662499, 
76.4115052509587, 76.8397977722343, 76.3357809444424, 76.032417796785, 
76.5716897079255, 76.4432530414779, 76.9089707506355, 76.1115217276383, 
76.2901271982118, 76.155437662499, 76.4115052509587, 76.8397977722343, 
76.3357809444424, 76.032417796785), lon = c(-110.80701574916, 
-110.75247172825, -110.861284852726, -110.203674311982, -110.383751512505, 
-110.450569844106, -110.22185564111, -110.556956546381, -110.24483308522, 
-110.217355202651, -110.80701574916, -110.75247172825, -110.861284852726, 
-110.203674311982, -110.383751512505, -110.450569844106, -110.22185564111, 
-110.556956546381, -110.24483308522, -110.217355202651, -110.80701574916, 
-110.75247172825, -110.861284852726, -110.203674311982, -110.383751512505, 
-110.450569844106, -110.22185564111, -110.556956546381, -110.24483308522, 
-110.217355202651, -110.80701574916, -110.75247172825, -110.861284852726, 
-110.203674311982, -110.383751512505, -110.450569844106, -110.22185564111, 
-110.556956546381, -110.24483308522, -110.217355202651)), .Names = c("id", 
"routeDateTime", "driverId", "lat", "lon"), row.names = c(NA, 
-40L), class = "data.frame")
screechOwl asked Dec 21 '12 19:12


2 Answers

How about this?

## Setup
library(geosphere)
metersPerMile <- 1609.34
pts <- df1[c("lon", "lat")]

## Pass in two derived data.frames that are lagged by one point
segDists <- distVincentyEllipsoid(p1 = pts[-nrow(pts),], 
                                  p2 = pts[-1,])
sum(segDists)/metersPerMile
# [1] 1013.919

(To use one of the faster distance calculation algorithms, just substitute distCosine, distVincentySphere, or distHaversine for distVincentyEllipsoid in the call above.)
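One thing to watch: the sample data holds four drivers (123, 456, 789, 246), so a single sum over all rows also counts the jump from one driver's last point to the next driver's first. Below is a minimal sketch of per-driver totals. It assumes rows are already in travel order within each driver. To keep it runnable without extra packages, it uses a hand-written haversine function (`haversine_m`, a hypothetical helper, not part of geosphere) and a small made-up data frame `toy` rather than `df1`. In practice you could call `distHaversine` or `distVincentyEllipsoid` on each driver's subset instead.

```r
## Hand-written haversine distance in metres (a stand-in for a geosphere call).
## r is the sphere radius in metres; 6378137 is an assumed value.
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6378137) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

## Total miles along one route whose points are given in travel order
route_miles <- function(lon, lat, meters_per_mile = 1609.344) {
  n <- length(lon)
  if (n < 2) return(0)  # a single point has no segments
  sum(haversine_m(lon[-n], lat[-n], lon[-1], lat[-1])) / meters_per_mile
}

## Toy data (hypothetical, not the question's df1):
## driver 1 moves 1 degree along the equator, driver 2 moves 2 degrees
toy <- data.frame(
  driverId = c(1, 1, 2, 2, 2),
  lon      = c(0, 1, 10, 11, 12),
  lat      = c(0, 0, 0, 0, 0)
)

## Split by driver first, so no segment links two different drivers
miles_by_driver <- sapply(split(toy, toy$driverId),
                          function(d) route_miles(d$lon, d$lat))
print(miles_by_driver)
```

With the question's data, the same pattern would be `split(df1, df1$driverId)`, optionally after sorting each driver's rows by `routeDateTime`.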

Josh O'Brien answered Sep 30 '22 00:09


Be VERY careful with missing data: distVincentyEllipsoid() returns 0 for the distance between any two points whose coordinates are both missing, i.e. c(NA, NA) and c(NA, NA), so such segments silently add nothing to the total.
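One way to guard against this is to drop incomplete points before building the lagged pairs, and to report how many were dropped. This is only a sketch in base R: the data frame `pts` here is made up for illustration, and removing a point means the neighbouring segment bridges the gap in a straight line, which may or may not be what you want.

```r
## Hypothetical route with one missing fix in the middle
pts <- data.frame(lon = c(-110.81, NA, -110.86, -110.20),
                  lat = c(76.57, NA, 76.91, 76.11))

## Keep only rows where both coordinates are present
ok    <- complete.cases(pts)
clean <- pts[ok, ]

## Report the number of dropped points so the gap is not silent
n_dropped <- sum(!ok)
message(n_dropped, " point(s) dropped for missing coordinates")

## Lagged pairs built from the cleaned points only
p1 <- clean[-nrow(clean), ]
p2 <- clean[-1, ]
```

`p1` and `p2` can then be passed to the distance function of your choice.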

sluque answered Sep 30 '22 02:09