Fastest way to compute the distance between each pair of points in Python

Tags: python, optimization, numpy, cython

In my project I need to compute the Euclidean distance between every pair of points stored in an array. The input is a 2D NumPy array with 3 columns holding the coordinates (x, y, z); each row defines one point.

I'm usually working with 5000 to 6000 points in my test cases.

My first algorithm uses Cython and my second uses NumPy. I find that my NumPy algorithm is faster than the Cython one.

Edit: with 6000 points:

NumPy 1.76 s / Cython 4.36 s

Here's my Cython code:

cimport cython
from libc.math cimport sqrt

@cython.boundscheck(False)
@cython.wraparound(False)
cdef void calcul1(double[::1] M, double[::1] R):
    cdef int i = 0
    cdef int max = M.shape[0]
    cdef int x, y
    cdef int start = 1

    for x in range(0, max, 3):
        for y in range(start, max, 3):
            R[i] = sqrt((M[y] - M[x])**2 + (M[y+1] - M[x+1])**2 + (M[y+2] - M[x+2])**2)
            i += 1
        start += 1

M is a memoryview of the initial input array, flattened with NumPy's flatten() before calcul1() is called. R is a memoryview of a 1D output array that stores all the results.

Here's my NumPy code:

import numpy as np

def calcul2(M):
    # M has shape (3, n): coordinates as rows, points as columns
    return np.sqrt(((M[:, :, np.newaxis] - M[:, np.newaxis, :])**2).sum(axis=0))

Here M is the initial input array, transposed with NumPy's transpose() before the function call so that the coordinates (x, y, z) are the rows and the points are the columns.

Moreover, this NumPy function is quite convenient because the array it returns is well organised. It's an n by n array, where n is the number of points, and each point has a row and a column. For example, the distance AB is stored at the intersection of row A and column B.

Here's how I call them (Cython function):

cpdef test():
    cdef double[::1] Mf
    cdef double[::1] out = np.empty(17997000, dtype=np.float64)  # (6000² - 6000) / 2

    M = np.arange(6000*3, dtype=np.float64).reshape(6000, 3)  # Example array with 6000 points
    Mf = M.flatten()    # because my Cython algorithm needs a 1D array
    Mt = M.transpose()  # because my NumPy algorithm needs coordinates as rows

    calcul2(Mt)

    calcul1(Mf, out)

Am I doing something wrong here? For my project, neither is fast enough.

1: Is there a way to improve my Cython code so that it beats NumPy's speed?

2: Is there a way to improve my NumPy code to make it even faster?

3: Or are there any other solutions? They must be Python/Cython (parallel computing, for example).

Thank you.

815

asked May 18 '16 11:05

UserAt

1 Answer

Not sure where you are getting your timings, but you can use scipy.spatial.distance:

import numpy as np
import scipy.spatial.distance as sd

M = np.arange(6000*3, dtype=np.float64).reshape(6000, 3)
np_result = calcul2(M)
sp_result = sd.cdist(M.T, M.T)  # SciPy usage
np.allclose(np_result, sp_result)
>>> True
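
Note that with M of shape (6000, 3), both calls above reduce over the 6000-long axis and return a 3 × 3 array. Below is a minimal sketch of the same check with each row treated as a point, assuming a small random array. The names `pts` and `pairwise_broadcast` are illustrative and not from the question.

```python
import numpy as np
import scipy.spatial.distance as sd

def pairwise_broadcast(pts):
    """All-pairs Euclidean distances for pts of shape (n, 3), via broadcasting."""
    diff = pts[:, np.newaxis, :] - pts[np.newaxis, :, :]  # shape (n, n, 3)
    return np.sqrt((diff ** 2).sum(axis=-1))              # shape (n, n)

rng = np.random.default_rng(0)
pts = rng.random((200, 3))        # illustrative data: 200 points, one per row

np_result = pairwise_broadcast(pts)
sp_result = sd.cdist(pts, pts)    # the rows of each argument are the points

print(np_result.shape, np.allclose(np_result, sp_result))
```

Entry [A, B] of either result is the distance between point A and point B, which is the layout described in the question.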

Timings:

%timeit calcul2(M)
1000 loops, best of 3: 313 µs per loop

%timeit sd.cdist(M.T, M.T)
10000 loops, best of 3: 86.4 µs per loop

Importantly, it's also useful to realize that your output is symmetric:

np.allclose(sp_result, sp_result.T)
>>> True

An alternative is to compute only the upper triangle of this array:

%timeit sd.pdist(M.T)
10000 loops, best of 3: 39.1 µs per loop
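
pdist returns a condensed 1D array with one entry per pair (i, j) where i < j, so n(n-1)/2 values rather than an n × n matrix. Here is a small sketch, assuming points as rows, of how squareform from the same module expands it back when the full matrix is needed (`pts` is an illustrative array):

```python
import numpy as np
from scipy.spatial.distance import cdist, pdist, squareform

rng = np.random.default_rng(1)
pts = rng.random((50, 3))       # illustrative data: 50 points, one per row

condensed = pdist(pts)          # distances for pairs (i, j) with i < j
full = squareform(condensed)    # symmetric (n, n) matrix, zeros on the diagonal

n = pts.shape[0]
print(condensed.shape[0] == n * (n - 1) // 2)
print(np.allclose(full, cdist(pts, pts)))
```

If the condensed form is enough for the rest of the project, skipping squareform avoids allocating the full matrix.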

Edit: I'm not sure which axis you want to treat as the points; it looks like you may be doing it both ways. Using the other axis (each row of M as a point) for comparison:

%timeit sd.pdist(M)
10 loops, best of 3: 135 ms per loop

Still about 10x faster than your current NumPy implementation (135 ms versus the 1.76 s you report).
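
Separately, it may be worth double-checking the loop bounds in calcul1. As written, y starts at start (1, 2, 3, ...) rather than at the point after x, so M[y] is not always an x coordinate, and M[y+2] can index past the end of M (for example y = 17998 on the first pass), which boundscheck(False) will not catch. Below is a pure-Python sketch of what the loop appears intended to compute, assuming the goal is one distance per pair i < j. The name `calcul1_reference` is illustrative, and the same bounds should carry over to the cdef version.

```python
import numpy as np
from math import sqrt
from scipy.spatial.distance import pdist

def calcul1_reference(M, R):
    """Pure-Python reference: M is the flattened (n*3,) array, R has n*(n-1)/2 slots."""
    i = 0
    size = M.shape[0]
    for x in range(0, size, 3):          # x is the offset of point A's first coordinate
        for y in range(x + 3, size, 3):  # y starts at the next point and stays aligned
            R[i] = sqrt((M[y] - M[x])**2
                        + (M[y+1] - M[x+1])**2
                        + (M[y+2] - M[x+2])**2)
            i += 1
    return i                             # number of distances written

pts = np.arange(5 * 3, dtype=np.float64).reshape(5, 3)  # 5 example points
flat = pts.flatten()
out = np.empty(5 * 4 // 2, dtype=np.float64)

written = calcul1_reference(flat, out)
print(written, np.allclose(out, pdist(pts)))
```

With these bounds the results come out in the same pair order as pdist, so the two can be compared directly.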

200

answered Sep 29 '22 20:09

Daniel
