Finding index of nearest point in numpy arrays of x and y coordinates

Tags:

I have two 2d numpy arrays: x_array contains positional information in the x-direction, y_array contains positions in the y-direction.

I then have a long list of x,y points.

For each point in the list, I need to find the array index of the location (specified in the arrays) which is closest to that point.

I have naively produced some code which works, based on this question: Find nearest value in numpy array

i.e.

import time import numpy  def find_index_of_nearest_xy(y_array, x_array, y_point, x_point):     distance = (y_array-y_point)**2 + (x_array-x_point)**2     idy,idx = numpy.where(distance==distance.min())     return idy[0],idx[0]  def do_all(y_array, x_array, points):     store = []     for i in xrange(points.shape[1]):         store.append(find_index_of_nearest_xy(y_array,x_array,points[0,i],points[1,i]))     return store   # Create some dummy data y_array = numpy.random.random(10000).reshape(100,100) x_array = numpy.random.random(10000).reshape(100,100)  points = numpy.random.random(10000).reshape(2,5000)  # Time how long it takes to run start = time.time() results = do_all(y_array, x_array, points) end = time.time() print 'Completed in: ',end-start

I'm doing this over a large dataset and would really like to speed it up a bit. Can anyone optimize this?

Thanks.

UPDATE: SOLUTION following suggestions by @silvado and @justin (below)

# Shoe-horn existing data for entry into KDTree routines combined_x_y_arrays = numpy.dstack([y_array.ravel(),x_array.ravel()])[0] points_list = list(points.transpose())   def do_kdtree(combined_x_y_arrays,points):     mytree = scipy.spatial.cKDTree(combined_x_y_arrays)     dist, indexes = mytree.query(points)     return indexes  start = time.time() results2 = do_kdtree(combined_x_y_arrays,points_list) end = time.time() print 'Completed in: ',end-start

This code above sped up my code (searching for 5000 points in 100x100 matrices) by 100 times. Interestingly, using scipy.spatial.KDTree (instead of scipy.spatial.cKDTree) gave comparable timings to my naive solution, so it is definitely worth using the cKDTree version...

737

asked May 30 '12 14:05

Pete W

2 Answers

Here is a scipy.spatial.KDTree example

In [1]: from scipy import spatial  In [2]: import numpy as np  In [3]: A = np.random.random((10,2))*100  In [4]: A Out[4]: array([[ 68.83402637,  38.07632221],        [ 76.84704074,  24.9395109 ],        [ 16.26715795,  98.52763827],        [ 70.99411985,  67.31740151],        [ 71.72452181,  24.13516764],        [ 17.22707611,  20.65425362],        [ 43.85122458,  21.50624882],        [ 76.71987125,  44.95031274],        [ 63.77341073,  78.87417774],        [  8.45828909,  30.18426696]])  In [5]: pt = [6, 30]  # <-- the point to find  In [6]: A[spatial.KDTree(A).query(pt)[1]] # <-- the nearest point  Out[6]: array([  8.45828909,  30.18426696])  #how it works! In [7]: distance,index = spatial.KDTree(A).query(pt)  In [8]: distance # <-- The distances to the nearest neighbors Out[8]: 2.4651855048258393  In [9]: index # <-- The locations of the neighbors Out[9]: 9  #then  In [10]: A[index] Out[10]: array([  8.45828909,  30.18426696])

answered Sep 29 '22 00:09

efirvida

scipy.spatial also has a k-d tree implementation: scipy.spatial.KDTree.

The approach is generally to first use the point data to build up a k-d tree. The computational complexity of that is on the order of N log N, where N is the number of data points. Range queries and nearest neighbour searches can then be done with log N complexity. This is much more efficient than simply cycling through all points (complexity N).

Thus, if you have repeated range or nearest neighbor queries, a k-d tree is highly recommended.

answered Sep 28 '22 23:09

silvado

Related questions
                            
                                How to make python Requests work via socks proxy
                            
                                Interactively validating Entry widget content in tkinter
                            
                                Convert True/False value read from file to boolean
                            
                                Interweaving two numpy arrays
                            
                                Access Lovoo API using Python
                            
                                Is it possible to copy a cell from one jupyter notebook to another?
                            
                                How to parse multiple nested sub-commands using python argparse?
                            
                                What does it mean for an object to be picklable (or pickle-able)?
                            
                                What are the differences between ipython and bpython?
                            
                                Automatically run %matplotlib inline in IPython Notebook
                            
                                Import python package from local directory into interpreter
                            
                                ValueError: could not convert string to float: id
                            
                                selenium with scrapy for dynamic page
                            
                                Union of 2 sets does not contain all items
                            
                                Is there any numpy group by function?
                            
                                How can I specify library versions in setup.py?
                            
                                Python setuptools: How can I list a private repository under install_requires?
                            
                                Argparse with required subparser
                            
                                how can i obtain pattern string from compiled regexp pattern in python
                            
                                open() gives FileNotFoundError/IOError: Errno 2 No such file or directory

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Finding index of nearest point in numpy arrays of x and y coordinates

Tags:

python

algorithm

numpy

Pete W

People also ask

2 Answers

efirvida

silvado

Recent Activity

Donate For Us