I have two arrays of x-y coordinates, and I would like to find the minimum Euclidean distance from each point in one array to all the points in the other array. The arrays are not necessarily the same size. For example:
    import numpy

    xy1 = numpy.array([[243, 3173], [525, 2997]])
    xy2 = numpy.array([[682, 2644], [277, 2651], [396, 2640]])
My current method loops through each coordinate xy in xy1 and calculates the distances between that coordinate and all the coordinates in xy2.
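    mindist = numpy.zeros(len(xy1))
    minid = numpy.zeros(len(xy1), dtype=int)  # integer dtype so the ids can be used as indices
    for i, xy in enumerate(xy1):
        # distances from this point of xy1 to every point of xy2
        dists = numpy.sqrt(numpy.sum((xy - xy2)**2, axis=1))
        mindist[i], minid[i] = dists.min(), dists.argmin()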
Is there a way to eliminate the for loop and somehow do element-by-element calculations between the two arrays? I envision generating a distance matrix for which I could find the minimum element in each row or column.
Another way to look at the problem: say I concatenate xy1 (length m) and xy2 (length p) into xy (length n), and I store the lengths of the original arrays. Theoretically, I should then be able to generate an n x n distance matrix from those coordinates, from which I can grab an m x p submatrix. Is there a way to efficiently generate this submatrix?
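The m x p matrix can also be built directly, without the n x n detour. The following is a minimal sketch using NumPy broadcasting on the example arrays above; the names diff and dist are chosen here for illustration, and the intermediate array of shape (m, p, 2) may be too large for very big inputs.

```python
import numpy as np

xy1 = np.array([[243, 3173], [525, 2997]])               # m = 2 points
xy2 = np.array([[682, 2644], [277, 2651], [396, 2640]])  # p = 3 points

# (m, 1, 2) - (1, p, 2) broadcasts to (m, p, 2): all pairwise differences
diff = xy1[:, np.newaxis, :] - xy2[np.newaxis, :, :]

# Sum the squared differences over the coordinate axis -> (m, p) distances
dist = np.sqrt((diff ** 2).sum(axis=2))

# Row-wise minimum: nearest point of xy2 for each point of xy1
minid = dist.argmin(axis=1)
mindist = dist.min(axis=1)
```

For the example data, minid comes out as [1, 2]: the first point of xy1 is nearest to xy2[1] and the second to xy2[2].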
Given a set of points in the two-dimensional plane, your task is to find the minimum Euclidean distance between two distinct points. The Euclidean distance between points $(x_1, y_1)$ and $(x_2, y_2)$ is $\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$.
The math.dist() function in Python computes the Euclidean distance between two points p and q, each given as a sequence (or iterable) of coordinates. The two points must have the same dimension. The function is new in Python 3.8. Returns: the calculated Euclidean distance between the given points.
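As a small illustration, the sketch below calls math.dist on two points and then brute-forces the closest pair in a short list. The sample points are borrowed from the arrays in the question; the all-pairs search is only one simple approach, quadratic in the number of points, and it needs Python 3.8 or later.

```python
import math
from itertools import combinations

# Distance between two individual points
p, q = (243, 3173), (277, 2651)
d = math.dist(p, q)  # sqrt(34**2 + 522**2)

# Brute-force minimum distance between two distinct points of one set:
# examine every unordered pair and keep the closest one
points = [(682, 2644), (277, 2651), (396, 2640)]
closest = min(combinations(points, 2), key=lambda pair: math.dist(*pair))
min_d = math.dist(*closest)
```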
(Months later) scipy.spatial.distance.cdist( X, Y ) gives all pairs of distances, for X and Y 2 dim, 3 dim ... It also does 22 different norms, detailed in its documentation.
    # cdist example: (nx,dim) (ny,dim) -> (nx,ny)
    import sys
    import numpy as np
    from scipy.spatial.distance import cdist

    #...............................................................................
    dim = 10
    nx = 1000
    ny = 100
    metric = "euclidean"
    seed = 1

    # change these params in sh or ipython: run this.py dim=3 ...
    for arg in sys.argv[1:]:
        exec( arg )
    np.random.seed(seed)
    np.set_printoptions( 2, threshold=100, edgeitems=10, suppress=True )

    title = "%s dim %d nx %d ny %d metric %s" % (
            __file__, dim, nx, ny, metric )
    print( "\n" + title )

    #...............................................................................
    X = np.random.uniform( 0, 1, size=(nx,dim) )
    Y = np.random.uniform( 0, 1, size=(ny,dim) )
    dist = cdist( X, Y, metric=metric )  # -> (nx, ny) distances

    #...............................................................................
    print( "scipy.spatial.distance.cdist: X %s Y %s -> %s" % (
            X.shape, Y.shape, dist.shape ))
    print( "dist average %.3g +- %.2g" % (dist.mean(), dist.std()) )
    # cdist returns a (1, 1) array here, so take its single element
    print( "check: dist[0,3] %.3g == cdist( [X[0]], [Y[3]] ) %.3g" % (
            dist[0,3], cdist( [X[0]], [Y[3]] )[0,0] ))

    # (trivia: how do pairwise distances between uniform-random points in the unit cube
    # depend on the metric ? With the right scaling, not much at all:
    # L1 / dim      ~ .33 +- .2/sqrt dim
    # L2 / sqrt dim ~ .4 +- .2/sqrt dim
    # Lmax / 2      ~ .4 +- .2/sqrt dim )
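Applied to the two small arrays from the question, the same call might look like the sketch below. It assumes SciPy is installed and relies on cdist's default metric, which is "euclidean"; the names minid and mindist follow the question's loop.

```python
import numpy as np
from scipy.spatial.distance import cdist

xy1 = np.array([[243, 3173], [525, 2997]])
xy2 = np.array([[682, 2644], [277, 2651], [396, 2640]])

# (2, 3) matrix: dist[i, j] is the distance from xy1[i] to xy2[j]
dist = cdist(xy1, xy2)

# For each point in xy1: index of the nearest point in xy2, and that distance
minid = dist.argmin(axis=1)
mindist = dist[np.arange(len(xy1)), minid]
```

Taking argmin and min along axis=0 instead gives, for each point in xy2, its nearest neighbour in xy1.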