I need to solve a computational problem that boils down to searching for reciprocally-nearest pairs of points between two sets. The problem goes something like this:
Given a set of points A and a set of points B in euclidean space, find all pairs (a,b) such that b is the closest point in B to a and a is the closest point in A to b.
The sets A and B are of approximately equal size, and we will call this size N. For my particular problem N is approximately 250,000.
The brute force solution is to compare every point against every other point, which has quadratic time complexity. Is there any more efficient algorithm for doing this?
The closest pair is the minimum of the closest pairs within each half and the closest pair between the two halves. To split the point set in two, we find the x-median of the points and use that as a pivot. Finding the closest pair of points in each half is subproblem that is solved recursively.
In this problem, a set of n points are given on the 2D plane. In this problem, we have to find the pair of points, whose distance is minimum. To solve this problem, we have to divide points into two halves, after that smallest distance between two points is calculated in a recursive way.
Brute-Force Method — Finding the Closest Pair The brute-force way is, like one that counts inversions in an array, to calculate the distances of every pair of points in the universe. For n number of points, we would need to measure n(n-1)/2 distances and the cost is square to n, or Θ(n²).
A data structure I found very useful when I had to do nearest neighbour searches was a kd-tree. Wikipedia has a nice overview and this is an excellent in-depth discussion of the algorithm if you're implementing your own (although a library may well exist already - you don't mention which language you're using). The most important thing about a kd-tree is that it allows nearest-neghbour searches to be performed in O(log N) time.
In that way, you could produce two lists of - the members of A and their nearest neighbour in B and the members of B and their nearest neighbour in A - in O(N log N) time. Then, you could compare the lists to see which pairs match. Done naively, that's O(N^2), though you might be able to think of a way to do it faster.
[edit] You've got me thinking; here's my second thought:
for(a in A)
b := nearest(B, a)
if a = nearest(A, b)
add (a, b) to results
end if
end for
function nearest(X, y)
return nearest member of set X to point y
end function
By my reckoning, that's O(N log N).
Sorry for picking up a rather old thread but I just wanted to add a solution I've found in my textbook for an Algorithm Design class:
There is a divide-and-conquer (think merge-sort) approach to solve this problem that should be O(n logn)
, I've only seen it for finding the shortest distance within one set of points but it should be easily adapted to require each pairing to consist of points from different sets.
d
)p
) in the left half and check the distance for all points between p_x
and p_x + d
, if any of these distances are shorter than d
that is the d
to return, otherwise return d
.If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With