I have a data frame (called coordinates) containing 3 columns: index, Latitude, Longitude. It has roughly 1,000 rows. I have the coordinates of a specific place and want to find the distance between that place and every coordinate in the data frame. Currently, I can use geopy.distance to find the distance between two specific coordinates, e.g.
import geopy.distance
site_coords = (38.898556, -77.037852)
place2_coords = (38.897147, -77.043934)
print(geopy.distance.vincenty(site_coords, place2_coords).km)
The above code gives 0.5503161689006362 (I have checked and this is correct)
My issue is with looping through the data frame (called coordinates) and calculating the distance for all coordinates in that data frame. Currently, this is what I have:
import geopy.distance
import pandas as pd
df = pd.read_csv('coordinates.csv', sep=',', header=None)
site_coords = (38.898556, -77.037852)
for index, row in df.iterrows():
    place2_coords = df
    x = geopy.distance.vincenty(site_coords, place2_coords).km
    print(x)
However, when I print x, it prints the same distance many times and the distance is incorrect. The coordinates file looks something like this when opened in Notepad, but has many more rows:
,Latitude,Longitude
0,73.3645,-0.9015
1,73.3645,-0.3995
2,73.3645,-0.5825
So I need a way to loop through the data frame and find the distance for each row.
To do this, convert the latitude and longitude of both points from degrees to radians by dividing them by 180/pi, which is approximately 57.29577951 (pi is approximately 3.14159; 22/7 is only a rough approximation). To calculate the distance between two places in miles, use 3,963 miles as the radius of the Earth.
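The conversion described above is the first step of the haversine formula. A minimal sketch follows, assuming a spherical Earth with the radius quoted above; the function name haversine_miles is only illustrative and does not come from any library mentioned in this thread.

```python
import math

EARTH_RADIUS_MILES = 3963.0  # radius quoted in the text; spherical approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    # math.radians is equivalent to dividing by 180/pi
    phi1, lam1, phi2, lam2 = map(math.radians, (lat1, lon1, lat2, lon2))
    dphi = phi2 - phi1
    dlam = lam2 - lam1
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# The two points from the question
print(haversine_miles(38.898556, -77.037852, 38.897147, -77.043934))
```

Because the Earth is treated as a sphere here, the result can differ slightly from geopy's ellipsoidal distance.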
Install it via pip install mpu --user and use it like this to get the haversine distance:
import mpu
# Point one
lat1 = 52.2296756
lon1 = 21.0122287
# Point two
lat2 = 52.406374
lon2 = 16.9251681
# What you were looking for
dist = mpu.
The math.dist() method returns the Euclidean distance between two points (p and q), where p and q are the coordinates of each point. Note: the two points (p and q) must have the same number of dimensions.
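A small illustration of math.dist (available from Python 3.8) is shown here. It measures straight-line distance in a flat plane, so it does not account for the Earth's curvature when fed latitude and longitude pairs. The points p and q are made up for the example.

```python
import math

# Two points in a flat 2-D plane (illustrative values, not geographic coordinates)
p = (0.0, 0.0)
q = (3.0, 4.0)

d = math.dist(p, q)  # Euclidean (straight-line) distance
print(d)
```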
Calculating the distance between latitude lines is easy because this distance never varies. If you treat the Earth as a sphere with a circumference of 25,000 miles, then one degree of latitude is 25,000/360 = 69.44 miles.
Here is a variation on sechilds's answer. The site_coords are passed in as an argument to the function. The apply call now supplies two arguments: the row from the DataFrame and site_coords:
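import pandas as pd
import geopy.distance

# Read the file with its header row so the Latitude/Longitude column names are available
df = pd.read_csv('coordinates.csv', index_col=0)

def calc_distance(row, site_coords):
    # Column names match the header of coordinates.csv
    station_coords = (row['Latitude'], row['Longitude'])
    return geopy.distance.distance(site_coords, station_coords).km

# Keyword arguments that apply does not recognise are forwarded to calc_distance
df['distance'] = df.apply(calc_distance, site_coords=(38.898556, -77.037852), axis=1)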
If your file looks like
,Latitude,Longitude
0,73.3645,-0.9015
1,73.3645,-0.3995
2,73.3645,-0.5825
but you read it with "header=None",
df = pd.read_csv('coordinates.csv', sep=',', header=None)
the first line will become a data row instead of the header. This may be the reason why you get an "AttributeError: 'Series' object has no attribute 'Latitude'".
Try deleting "header=None" from your code.
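df = pd.read_csv('coordinates.csv', sep=',')  # the header row now supplies the column names
site_coords = (38.898556, -77.037852)
# vincenty was removed in geopy 2.0; geodesic is its replacement
df.apply(lambda row: geopy.distance.geodesic(site_coords, (row.Latitude, row.Longitude)).km, axis=1)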
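With many rows, a vectorized calculation avoids calling a Python function once per row. The sketch below swaps geopy's ellipsoidal distance for the haversine formula on a sphere (mean radius 6371 km, an assumption), so the results will differ slightly from geopy's. The function name haversine_km and the inline sample data are illustrative only; a real run would read coordinates.csv instead.

```python
from io import StringIO

import numpy as np
import pandas as pd

# Sample rows copied from the question; replace with pd.read_csv('coordinates.csv', index_col=0)
csv_text = """,Latitude,Longitude
0,73.3645,-0.9015
1,73.3645,-0.3995
2,73.3645,-0.5825
"""
df = pd.read_csv(StringIO(csv_text), index_col=0)

site_lat, site_lon = 38.898556, -77.037852
EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation (assumption)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km; lat2/lon2 may be scalars, arrays or Series."""
    lat1, lon1, lat2, lon2 = (np.radians(v) for v in (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


# One call computes the distance for every row at once
df['distance_km'] = haversine_km(site_lat, site_lon, df['Latitude'], df['Longitude'])
print(df)
```

If ellipsoidal accuracy matters, the apply-based geopy versions above remain the safer choice; this form trades a little accuracy for speed.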