I have a list of US ZIP codes and need to calculate the distance between every pair of ZIP code points. The list is about 6,000 ZIPs long, and each entry has ZIP, City, State, Lat, Long, Area and Population.
So I have to calculate the distance for all pairs of points, i.e. 6000C2 combinations.
Here is a sample of my data
I've tried this in SAS, but it's too slow and inefficient, so I'm looking for a way to do it in Python or R.
Any leads would be appreciated.
Python Solution
If you have the latitude and longitude for each ZIP code, you can calculate the distance between two of them directly with the Haversine formula, which gives the great-circle distance between two points on a sphere. The 'mpu' library provides an implementation.
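Example code:
import mpu
# (latitude, longitude) pairs in decimal degrees
zip_00501 = (40.817923, -73.045317)
zip_00544 = (40.788827, -73.039405)
# Great-circle distance, rounded to two decimals
dist = round(mpu.haversine_distance(zip_00501, zip_00544), 2)
print(dist)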
The distance is returned in kilometres. Output:
3.27
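For the full list, calling a function once per pair means 6000C2 = 17,997,000 separate Python calls. A vectorized version of the same Haversine formula may be considerably faster. The following is a minimal sketch using NumPy, not part of 'mpu'; the function name pairwise_haversine_km and the mean Earth radius of 6371 km are assumptions chosen for illustration.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def pairwise_haversine_km(lat_deg, lon_deg):
    """Return an (n, n) matrix of great-circle distances in km.

    lat_deg and lon_deg are equal-length sequences in decimal degrees.
    (Hypothetical helper for illustration.)
    """
    lat = np.radians(np.asarray(lat_deg, dtype=float))
    lon = np.radians(np.asarray(lon_deg, dtype=float))

    # Broadcasting builds every pairwise difference at once.
    dlat = lat[:, None] - lat[None, :]
    dlon = lon[:, None] - lon[None, :]

    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2) ** 2)

    # Clip guards against tiny floating-point overshoot outside [0, 1].
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


# The same two points as in the example above.
lats = [40.817923, 40.788827]
lons = [-73.045317, -73.039405]
dist = pairwise_haversine_km(lats, lons)
print(round(dist[0, 1], 2))
```

The result is symmetric with zeros on the diagonal, so the unique pairs can be read from the upper triangle with np.triu_indices(n, k=1). For 6,000 points each (n, n) array holds 36 million floats (roughly 288 MB), and a few temporaries of that size are created along the way, so on a machine with limited memory it may be better to process the rows in chunks.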
PS. If you don't have coordinates for the ZIP codes, you can look them up with the 'SearchEngine' class of the 'uszipcode' library (US ZIP codes only).
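import mpu
from uszipcode import SearchEngine
# For the extensive list of ZIP codes, set simple_zipcode=False.
# Note: newer uszipcode releases may not accept simple_zipcode;
# check the documentation for the version you have installed.
search = SearchEngine(simple_zipcode=True)
# Look up each ZIP code and read its coordinates
zip1 = search.by_zipcode('92708')
lat1 = zip1.lat
long1 = zip1.lng
zip2 = search.by_zipcode('53404')
lat2 = zip2.lat
long2 = zip2.lng
# Distance in kilometres between the two ZIP codes
mpu.haversine_distance((lat1, long1), (lat2, long2))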
Hope this helps!!
In SAS, use the GEODIST function.
GEODIST Function
Returns the geodetic distance between two latitude and longitude coordinates.
…
Syntax
GEODIST(latitude-1, longitude-1, latitude-2, longitude-2 <, options>)