I would like to calculate the distance along a path of GPS coordinates which are stored in two columns in a data frame.
import pandas as pd

df = pd.DataFrame({'lat': [1, 2.5, 3, 1.2],
                   'lng': [1, 1, 2.1, 1],
                   'label': ['foo', 'bar', 'zip', 'foo']})
print(df)
Output:
  label  lat  lng
0   foo  1.0  1.0
1   bar  2.5  1.0
2   zip  3.0  2.1
3   foo  1.2  1.0
The GPS coordinates are stored in radians. So, the distance between the first and second rows of the dataframe can be calculated as follows:
import math as m

r1 = 0
r2 = 1
distance = m.acos(m.sin(df.lat[r1]) * m.sin(df.lat[r2]) +
                  m.cos(df.lat[r1]) * m.cos(df.lat[r2]) * m.cos(df.lng[r2] - df.lng[r1])) * 6371
I would like to repeat this calculation between every pair of consecutive rows and then add each short distance into the longer final distance for the full path.
I could put this into a loop for n-1 rows of the dataframe, but is there a more pythonic way to do this?
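For reference, the explicit loop mentioned above might look like the following sketch. The names leg_distance, EARTH_RADIUS_KM and total_km are illustrative and not part of the original code, and the clamp before acos is an added safeguard against rounding error:

```python
import math as m
import pandas as pd

df = pd.DataFrame({'lat': [1, 2.5, 3, 1.2],
                   'lng': [1, 1, 2.1, 1],
                   'label': ['foo', 'bar', 'zip', 'foo']})

EARTH_RADIUS_KM = 6371  # same radius as in the single-pair calculation


def leg_distance(lat1, lng1, lat2, lng2):
    """Distance between two points given in radians (spherical law of cosines)."""
    cos_angle = (m.sin(lat1) * m.sin(lat2) +
                 m.cos(lat1) * m.cos(lat2) * m.cos(lng2 - lng1))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return m.acos(cos_angle) * EARTH_RADIUS_KM


# Walk over consecutive row pairs and accumulate the leg distances.
total_km = 0.0
for i in range(len(df) - 1):
    total_km += leg_distance(df['lat'].iloc[i], df['lng'].iloc[i],
                             df['lat'].iloc[i + 1], df['lng'].iloc[i + 1])

print(total_km)
```

This works, but it makes one Python-level function call per row pair, which is the overhead a vectorized approach avoids.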
Vectorized Haversine function:
import numpy as np

def haversine(lat1, lon1, lat2, lon2, to_radians=True, earth_radius=6371):
    """
    Slightly modified version of http://stackoverflow.com/a/29546836/2901002

    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees or in radians).

    All (lat, lon) coordinates must have numeric dtypes and be of equal length.
    """
    if to_radians:
        lat1, lon1, lat2, lon2 = np.radians([lat1, lon1, lat2, lon2])

    a = np.sin((lat2-lat1)/2.0)**2 + \
        np.cos(lat1) * np.cos(lat2) * np.sin((lon2-lon1)/2.0)**2

    return earth_radius * 2 * np.arcsin(np.sqrt(a))
Solution:
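# The coordinates are already in radians, so skip the degree conversion.
# shift() pairs each row with the previous one; the first row has no predecessor.
df['dist'] = haversine(df['lat'], df['lng'],
                       df['lat'].shift(), df['lng'].shift(),
                       to_radians=False)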
Result:
In [65]: df
Out[65]:
  label  lat  lng          dist
0   foo  1.0  1.0           NaN
1   bar  2.5  1.0   9556.500000
2   zip  3.0  2.1   7074.983158
3   foo  1.2  1.0  10206.286067
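The question also asks for the length of the full path. A minimal sketch of that last step, assuming the haversine function and the data frame shown above: summing the dist column gives the total, because the pandas sum() skips the leading NaN by default. The variable name total_km is illustrative.

```python
import numpy as np
import pandas as pd


def haversine(lat1, lon1, lat2, lon2, to_radians=True, earth_radius=6371):
    """Great circle distance between two points (vectorized over arrays/Series)."""
    if to_radians:
        lat1, lon1, lat2, lon2 = np.radians([lat1, lon1, lat2, lon2])
    a = np.sin((lat2 - lat1) / 2.0) ** 2 + \
        np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2
    return earth_radius * 2 * np.arcsin(np.sqrt(a))


df = pd.DataFrame({'lat': [1, 2.5, 3, 1.2],
                   'lng': [1, 1, 2.1, 1],
                   'label': ['foo', 'bar', 'zip', 'foo']})

# Distance from the previous row to the current row (NaN for the first row).
df['dist'] = haversine(df['lat'], df['lng'],
                       df['lat'].shift(), df['lng'].shift(),
                       to_radians=False)

# Series.sum() ignores NaN by default (skipna=True), so this is the path length.
total_km = df['dist'].sum()
print(total_km)
```

If a running total along the path is more useful than a single number, df['dist'].cumsum() gives the distance covered up to each row.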