Fastest way to find the closest point to a given point in 3D, in Python

Tags: python, distance, points, closest

Let's say I have 10,000 points in A and 10,000 points in B, and I want to find the closest point in A for every point in B.

Currently, I simply loop over every point in B and, for each one, over every point in A to find which is closest. For example:

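B = [(.5, 1, 1), (1, .1, 1), (1, 1, .2)]
A = [(1, 1, .3), (1, 0, 1), (.4, 1, 1)]
C = {}  # maps each point in B to its closest point in A
for bp in B:
    closestDist = -1  # sentinel: no candidate seen yet
    for ap in A:
        # squared Euclidean distance (the square root is not needed for comparison)
        dist = sum(((bp[0]-ap[0])**2, (bp[1]-ap[1])**2, (bp[2]-ap[2])**2))
        if closestDist > dist or closestDist == -1:
            C[bp] = ap
            closestDist = dist
print(C)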

However, I am sure there is a faster way to do this... any ideas?


asked Apr 14 '10 21:04

Saebin

3 Answers

I typically use a kd-tree in such situations.

There is a C++ implementation wrapped with SWIG and bundled with BioPython that's easy to use.
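To illustrate the idea, here is a minimal pure-Python kd-tree sketch for the question's data. It is not the BioPython implementation mentioned above; the names `build_kdtree`, `sq_dist` and `nearest` are made up for this example, and a compiled library will be much faster on 10,000 points.

```python
def build_kdtree(points, depth=0):
    """Recursively build a kd-tree as nested tuples (point, left, right)."""
    if not points:
        return None
    axis = depth % 3  # cycle through x, y, z
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2  # the median point splits the remaining points
    return (points[mid],
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


def nearest(node, target, depth=0, best=None):
    """Return (squared_distance, point) for the tree point closest to target."""
    if node is None:
        return best
    point, left, right = node
    d = sq_dist(point, target)
    if best is None or d < best[0]:
        best = (d, point)
    axis = depth % 3
    diff = target[axis] - point[axis]
    # Descend first into the side of the splitting plane that contains target
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, target, depth + 1, best)
    # Visit the other side only if the plane is closer than the best match so far
    if diff ** 2 < best[0]:
        best = nearest(far, target, depth + 1, best)
    return best


A = [(1, 1, .3), (1, 0, 1), (.4, 1, 1)]
B = [(.5, 1, 1), (1, .1, 1), (1, 1, .2)]

tree = build_kdtree(A)
C = {bp: nearest(tree, bp)[1] for bp in B}
print(C)
```

The tree is built once from A and then queried for each point in B; each query can skip whole subtrees, which is where the saving over the double loop comes from. If SciPy is available, `scipy.spatial.cKDTree(A).query(B)` should give the same neighbours as distances and indices without any hand-written tree code.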

answered Oct 17 '22 04:10

awesomo

You could use some spatial lookup structure. A simple option is an octree; fancier ones include the BSP tree.

answered Oct 17 '22 03:10

Thomas

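You could use numpy broadcasting. For example,

import numpy as np

a = np.array(A)
b = np.array(B)
# loop over the points of b; each (a - i) broadcasts i across all rows of a
for i in b:
    print(((a - i)**2).sum(axis=1).argmin())

prints 2, 1, 0, which are the indices of the rows of a closest to the first, second and third rows of b, respectively.

Alternatively, you can drop the loop and broadcast over both arrays at once:

# the difference at [i, j] is b[i] - a[j], so z[i, j] is their squared distance
z = ((b[:, np.newaxis, :] - a)**2).sum(axis=2)
z.argmin(axis=1)  # gives array([2, 1, 0])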
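One caveat for the sizes in the question: broadcasting all of B against all of A builds an intermediate array of 10,000 x 10,000 x 3 values, roughly 2.4 GB as float64. A possible workaround, sketched here under the assumption that memory is the bottleneck, is to process B in chunks. The function name `closest_indices` and the default chunk size are illustrative choices, not part of the answer above.

```python
import numpy as np


def closest_indices(a, b, chunk=256):
    """For each row of b, return the index of the nearest row of a."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    out = np.empty(len(b), dtype=int)
    for start in range(0, len(b), chunk):
        block = b[start:start + chunk]                  # shape (m, 3)
        diff = block[:, np.newaxis, :] - a[np.newaxis]  # shape (m, len(a), 3)
        # squared distances have shape (m, len(a)); argmin picks the nearest a per b
        out[start:start + chunk] = (diff ** 2).sum(axis=2).argmin(axis=1)
    return out


A = [(1, 1, .3), (1, 0, 1), (.4, 1, 1)]
B = [(.5, 1, 1), (1, .1, 1), (1, 1, .2)]

idx = closest_indices(A, B)
C = {bp: A[i] for bp, i in zip(B, idx)}
print(idx, C)
```

This still does the same number of distance computations as the double loop, but each chunk is a single vectorized operation, and peak memory is bounded by the chunk size rather than by the size of B.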

I hope that helps.

answered Oct 17 '22 04:10

reckoner
