Python code 60x slower than Java [closed]

Question

This code performs Haversine distance calculations and is part of a larger project.

The Java implementation seems to be 60x faster than Python. They have nearly identical implementations.

This performance is on the same machine and OS. I have tried various combinations:

OS : Linux and Windows
Machine with different CPUs (i5 from 4th and 6th gen)

but the difference in execution speed is consistent.

Python implementation:

from math import radians, sin, cos, sqrt, asin
import time 

def haversine(lat1, lon1, lat2, lon2):

  R = 6372.8 # Earth radius in kilometers

  dLat = radians(lat2 - lat1)
  dLon = radians(lon2 - lon1)
  lat1 = radians(lat1)
  lat2 = radians(lat2)

  a = sin(dLat/2)**2 + cos(lat1)*cos(lat2)*sin(dLon/2)**2
  c = 2*asin(sqrt(a))

  return R * c

start_time = time.time()
#START: Performance critical part
for x in range(1, 1*1000*1000*10): 
    haversine(36.12, -86.67, 33.94, -118.40)
#END: Performance critical part
elapsed_time = time.time() - start_time
print("Python elapsed time = " + str(elapsed_time)+" seconds")

Java implementation:

import java.util.Date;

public class HaversineTest {
    public static final double R = 6372.8; // In kilometers
    public static double haversine(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        lat1 = Math.toRadians(lat1);
        lat2 = Math.toRadians(lat2);

        double a = Math.pow(Math.sin(dLat / 2),2) + Math.pow(Math.sin(dLon / 2),2) * Math.cos(lat1) * Math.cos(lat2);
        double c = 2 * Math.asin(Math.sqrt(a));
        return R * c;
    }
    public static void main(String[] args) {
        int loopCount = 1*1000*1000*10;
        long start_time = new Date().getTime();
        for(int i=0;i<loopCount;i++){
            haversine(36.12, -86.67, 33.94, -118.40);
        }
        long end_time = new Date().getTime();
        long elapsed_time=end_time-start_time;
        System.out.println("Java elapsed time = "+elapsed_time+" ms");
    }
}

Versions :

Python 3.6.1 (v3.6.1:69c0db5, Mar 21 2017, 18:41:36) [MSC v.1900 64 bit (AMD64)] on win32

java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)

Outputs:

Java elapsed time = 229 ms

Python elapsed time = 15.028019428253174 seconds

Can I improve the Python performance?

MSeifert · Accepted Answer

There are two major bottlenecks:

The loop
The function calls (the haversine function and the math functions).

If you use numpy and operate on arrays you can only need to call the functions once because the loop is vectorized over the arrays - net result the code runs 30 times faster on my computer (code is taken from Prunes answer but changed so it works with numpy arrays):

import numpy as np

R = 6372.8 # Earth radius in kilometers

def haversine(lat1, lon1, lat2, lon2):
    dLat = np.radians(lat2 - lat1)
    dLon = np.radians(lon2 - lon1)
    lat1 = np.radians(lat1)
    lat2 = np.radians(lat2)
    sinLat = np.sin(dLat/2)
    sinLon = np.sin(dLon/2)

    a = sinLat * sinLat + np.cos(lat1) * np.cos(lat2) * sinLon * sinLon
    c = 2 * np.arcsin(np.sqrt(a))

    return R * c

haversine(np.ones(10000000) * 36.12,
          np.ones(10000000) * -86.67,
          np.ones(10000000) * 33.94,
          np.ones(10000000) * -118.40)

But that creates a lot of huge temporary arrays, with numba you can avoid them and still use the math functions:

from math import radians, sin, cos, sqrt, asin
from numba import njit

@njit
def haversine(lat1, lon1, lat2, lon2):

    R = 6372.8 # Earth radius in kilometers

    dLat = radians(lat2 - lat1)
    dLon = radians(lon2 - lon1)
    lat1 = radians(lat1)
    lat2 = radians(lat2)

    a = sin(dLat/2)**2 + cos(lat1)*cos(lat2)*sin(dLon/2)**2
    c = 2*asin(sqrt(a))

    return R * c

@njit
def haversine_loop():
    res = np.empty(10*1000*1000, dtype=np.float_)
    for x in range(0, 10*1000*1000): 
        res[x] = haversine(36.12, -86.67, 33.94, -118.40)
    return res

haversine_loop()

That actually gives an 50 times improvement over the numpy code (so 150 times faster in the end). But you need to check if such an approach is feasible in your case (numba is not a lightweight dependency!).

Python code 60x slower than Java [closed]

Tags:

java

performance

python

Python implementation:

Java implementation:

Shamit Verma

1 Answers

MSeifert

Recent Activity

Donate For Us

Python code 60x slower than Java [closed]

Tags:

java

performance

python

Python implementation:

Java implementation:

Shamit Verma

1 Answers

MSeifert

Related questions

Recent Activity

Donate For Us