Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Finding the circle that covers most points in space

I was at a interview for a High Frequency Trading firm. They asked me for an algorithm of O(n):

  • Given n points in space
  • Given a hash function that returns the points of the plane in O(1),
  • Find a best matched circle that covers the most points in space.

Requirements:

  • The circle center has to have integer coordinates and it's radius will be 3.
  • Points within the circle will not necessarily have integer coordinates

I Googled and did some research. There is a O(n) algorithm (Best circle placement by Chazelle from Princeton University), but it is kind of beyond my level to understand and put it together to explain it in 10 mins. I already know O(n^2) and O(n^3) algorithms.

Please help me find an O(n) algorithm.

like image 525
user2149873 Avatar asked Mar 09 '13 00:03

user2149873


1 Answers

I guess the integer coordinate constraint simplifies the problem notably. This looks like O(n) to me:

-Make a dictionary of all integer points in space and set the entries to 0.

-For each datapoint find the integer points that are within radius 3, and add 1 to the corresponding entries of the dictionary. The reason for doing this is that the set of points that can be the centers of a circle in which that particular datapoint is inside is the integer restriction of a circle with the same radius around that datapoint. The search could be done over all points lying on a square of length 6 (thought not all points need to be evaluated explicitly as these inside the inscribed hypercube are inside for sure).

-Return the integer point corresponding to the maximum value of the dictionary, ie the center for which most datapoints are inside the circle.

Edit: I guess some code is better than explanations. This is working python with numpy and matplotlib. Shouldn't be too difficult to read:

# -*- coding: utf-8 -*-
"""
Created on Mon Mar 11 19:22:12 2013

@author: Zah
"""

from __future__ import division
import numpy as np
import numpy.random
import matplotlib.pyplot as plt
from collections import defaultdict
import timeit
def make_points(n):
    """Generates n random points"""
    return np.random.uniform(0,30,(n,2))

def find_centers(point, r):
    """Given 1 point, find all possible integer centers searching in a square 
    around that point. Note that this method can be imporved."""
    posx, posy = point
    centers = ((x,y) 
        for x in xrange(int(np.ceil(posx - r)), int(np.floor(posx + r)) + 1)
        for y in xrange(int(np.ceil(posy - r)), int(np.floor(posy + r)) + 1)        
        if (x-posx)**2 + (y-posy)**2 < r*r)
    return centers


def find_circle(n, r=3.):
    """Find the best center"""
    points = make_points(n)
    d = defaultdict(int)
    for point in points:
        for center in find_centers(point, r):
            d[center] += 1
    return max(d , key = lambda x: d[x]), points

def make_results():
    """Green circle is the best center. Red crosses are posible centers for some
    random point as an example"""
    r = 3
    center, points = find_circle(100)
    xv,yv = points.transpose()
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.set_aspect(1)
    ax.scatter(xv,yv)
    ax.add_artist(plt.Circle(center, r, facecolor = 'g', alpha = .5, zorder = 0))
    centersx, centersy  = np.array(list(find_centers(points[0], r))).transpose()
    plt.scatter(centersx, centersy,c = 'r', marker = '+')
    ax.add_artist(plt.Circle(points[0], r, facecolor = 'r', alpha = .25, zorder = 0))
    plt.show()

if __name__ == "__main__":
    make_results()

Results: Result figure Green circle is the best one, and the red stuff demonstrates how centers are picked for some random point.

In [70]: %timeit find_circle(1000)
1 loops, best of 3: 1.76 s per loop

In [71]: %timeit find_circle(2000)
1 loops, best of 3: 3.51 s per loop

In [72]: %timeit find_circle(3000)
1 loops, best of 3: 5.3 s per loop

In [73]: %timeit find_circle(4000)
1 loops, best of 3: 7.03 s per loop

In my really slow machine. Behaviour is clearly linear.

like image 73
Zah Avatar answered Oct 30 '22 16:10

Zah