Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Choose the closest k points from given n points

Tags:

You are given a set U of n points on the plane and you can compute distance between any pair of points in constant time. Choose a subset of U called C such that C has exactly k points in it and the distance between the farthest 2 points in C is as small as possible for given k. 1 < k <= n

What's the fastest way to do this besides the obvious n-choose-k solution?

like image 949
pathikrit Avatar asked Mar 30 '11 21:03

pathikrit


People also ask

How do you find the closest point to a set of points?

The closest pair is the minimum of the closest pairs within each half and the closest pair between the two halves. To split the point set in two, we find the x-median of the points and use that as a pivot. Finding the closest pair of points in each half is subproblem that is solved recursively.

Which data structure would you use to query the K nearest points of a set on a 2D plane?

A KD Tree does the job. It's a geometric data structure that can find nearest neighbors efficiently by cutting the search space similarly to a binary search.


2 Answers

A solution is shown in Finding k points with minimum diameter and related problems - Aggarwal, 1991. The algorithm described therein has running time: O(k^2.5 n log k + n log n)

For those who have no access to the paper: the problem is called k-diameter and definied as

Find a set of k points with minimum diameter. The diameter of a set is the maximum distance between any points of the set.

I cannot really give an overview over the presented algorithm, but it includes computing the (3k - 3)th order Voronoi diagram of the points, and then solve the problem for each of the O(kn) Voronoi sets (by computing maximum independent sets in some bipartite graphs)... I guess that I am trying to say is, that it is highly non-trivial, and far beyond both an interview and this site :-)

like image 156
dcn Avatar answered Oct 03 '22 18:10

dcn


Since this is an interview question, here is my shot at a solution. (As dcn points out below, this is not guaranteed to return the optimal solution, though it should still be a decent heuristic. Good catch, dcn!)

  1. Create a set Sp with a single point P.
  2. Compute the distance between every point in Sp and every point outside of it, then add the point with the smallest max distance to Sp.
  3. Repeat 2. until Sp has k points.
  4. Repeat 1-3 using each point once as the initial P. Take the Sp which has the smallest max distance.

There are O(k) points in Sp, and O(n) points outside of it, so finding the point with the smallest max distance is O(nk). We repeat this k times, then repeat the whole procedure n times, for an overall complexity of O(n2k2).

We can improve on this by caching the max distance between any point in Sp and each point outside of Sp. If maxDistanceFromPointInS[pointOutsideS] is, say, an O(1) hash-table containing the current max distance between every point pointOutsideS and some point inside Sp, then every time we add a new point newPoint, we set maxDistanceFromPointInS[p] = Max(maxDistanceFromPointInS[p], distance(newPoint, p)) for all points p outside of Sp. Then finding the smallest max distance is O(n), adding a point to Sp is O(n). Repeating this k times gives us O((n+n)k) = O(nk). Finally, we repeat the whole procedure n times, for an overall complexity of O(n2k).

We could improve finding the smallest max distance to O(1) using a heap, but that would not change the overall complexity.


By the way, it took an hour to write this all up - there's no way I could have done this in an interview.

like image 35
BlueRaja - Danny Pflughoeft Avatar answered Oct 03 '22 20:10

BlueRaja - Danny Pflughoeft