This code performs Haversine distance calculations and is part of a larger project.
The Java implementation seems to be 60x faster than Python. They have nearly identical implementations.
This performance is on the same machine and OS. I have tried various combinations:
but the difference in execution speed is consistent.
from math import radians, sin, cos, sqrt, asin
import time
def haversine(lat1, lon1, lat2, lon2):
R = 6372.8 # Earth radius in kilometers
dLat = radians(lat2 - lat1)
dLon = radians(lon2 - lon1)
lat1 = radians(lat1)
lat2 = radians(lat2)
a = sin(dLat/2)**2 + cos(lat1)*cos(lat2)*sin(dLon/2)**2
c = 2*asin(sqrt(a))
return R * c
start_time = time.time()
#START: Performance critical part
for x in range(1, 1*1000*1000*10):
haversine(36.12, -86.67, 33.94, -118.40)
#END: Performance critical part
elapsed_time = time.time() - start_time
print("Python elapsed time = " + str(elapsed_time)+" seconds")
import java.util.Date;
public class HaversineTest {
public static final double R = 6372.8; // In kilometers
public static double haversine(double lat1, double lon1, double lat2, double lon2) {
double dLat = Math.toRadians(lat2 - lat1);
double dLon = Math.toRadians(lon2 - lon1);
lat1 = Math.toRadians(lat1);
lat2 = Math.toRadians(lat2);
double a = Math.pow(Math.sin(dLat / 2),2) + Math.pow(Math.sin(dLon / 2),2) * Math.cos(lat1) * Math.cos(lat2);
double c = 2 * Math.asin(Math.sqrt(a));
return R * c;
}
public static void main(String[] args) {
int loopCount = 1*1000*1000*10;
long start_time = new Date().getTime();
for(int i=0;i<loopCount;i++){
haversine(36.12, -86.67, 33.94, -118.40);
}
long end_time = new Date().getTime();
long elapsed_time=end_time-start_time;
System.out.println("Java elapsed time = "+elapsed_time+" ms");
}
}
Versions :
Python 3.6.1 (v3.6.1:69c0db5, Mar 21 2017, 18:41:36) [MSC v.1900 64 bit (AMD64)] on win32
java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)
Outputs:
Java elapsed time = 229 ms
Python elapsed time = 15.028019428253174 seconds
Can I improve the Python performance?
There are two major bottlenecks:
haversine function and the math functions).If you use numpy and operate on arrays you can only need to call the functions once because the loop is vectorized over the arrays - net result the code runs 30 times faster on my computer (code is taken from Prunes answer but changed so it works with numpy arrays):
import numpy as np
R = 6372.8 # Earth radius in kilometers
def haversine(lat1, lon1, lat2, lon2):
dLat = np.radians(lat2 - lat1)
dLon = np.radians(lon2 - lon1)
lat1 = np.radians(lat1)
lat2 = np.radians(lat2)
sinLat = np.sin(dLat/2)
sinLon = np.sin(dLon/2)
a = sinLat * sinLat + np.cos(lat1) * np.cos(lat2) * sinLon * sinLon
c = 2 * np.arcsin(np.sqrt(a))
return R * c
haversine(np.ones(10000000) * 36.12,
np.ones(10000000) * -86.67,
np.ones(10000000) * 33.94,
np.ones(10000000) * -118.40)
But that creates a lot of huge temporary arrays, with numba you can avoid them and still use the math functions:
from math import radians, sin, cos, sqrt, asin
from numba import njit
@njit
def haversine(lat1, lon1, lat2, lon2):
R = 6372.8 # Earth radius in kilometers
dLat = radians(lat2 - lat1)
dLon = radians(lon2 - lon1)
lat1 = radians(lat1)
lat2 = radians(lat2)
a = sin(dLat/2)**2 + cos(lat1)*cos(lat2)*sin(dLon/2)**2
c = 2*asin(sqrt(a))
return R * c
@njit
def haversine_loop():
res = np.empty(10*1000*1000, dtype=np.float_)
for x in range(0, 10*1000*1000):
res[x] = haversine(36.12, -86.67, 33.94, -118.40)
return res
haversine_loop()
That actually gives an 50 times improvement over the numpy code (so 150 times faster in the end). But you need to check if such an approach is feasible in your case (numba is not a lightweight dependency!).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With