Given n points, choose a point in the given list such that the sum of distances to this point is minimum ,compared to all others.
Distance is measured in the following manner.
For a point (x,y) all 8 adjacent points have distance 1.
(x+1,y)(x+1,y+1),(x+1,y-1),(x,y+1),(x,y-1),(x-1,y)(x-1,y+1),(x-1,y-1)
EDIT
More clearer explanation.
A function foo is defined as
foo(point_a,point_b) = max(abs(point_a.x - point_b.x),abs(point_a.y - point_b.y))
Find a point x such that sum([foo(x,y) for y in list_of_points]) is minimum.
Example
Input:
12 -14
-3 3
-14 7
-14 -3
2 -12
-1 -6
Output
-1 -6
Eg: Distance between (4,5) and 6,7) is 2.
This can be done in O(n^2) time, by checking the sum of each pair. Is there any better algorithm to do it?
For a set of two numbers, the value of the median will be the same as the value of the mean. For example, 2 and 10 are the two numbers. Also, mean = (2+10)/2 = 12/2 = 6. Hence, the median of two numbers is equal to the mean.
Median, in statistics, is the middle value of the given list of data, when arranged in an order. The arrangement of data or observations can be done either in ascending order or descending order.
For example, take this list of numbers: 10, 10, 20, 40, 70. The mean (informally, the “average“) is found by adding all of the numbers together and dividing by the number of items in the set: 10 + 10 + 20 + 40 + 70 / 5 = 30. The median is found by ordering the set from lowest to highest and finding the exact middle.
Update: it sometimes fails to find the optimum, I'll leave this here till I find the problem.
this is O(n)
: nth is O(n) (expected, not worst), iterating over the list is O(n). If you need strict O() then pick the middle element with sorting but then it's going to be O(n*log(n)).
Note: it's easy to modifiy it to return all the optimal points.
import sys
def nth(sample, n):
pivot = sample[0]
below = [s for s in sample if s < pivot]
above = [s for s in sample if s > pivot]
i, j = len(below), len(sample)-len(above)
if n < i: return nth(below, n)
elif n >= j: return nth(above, n-j)
else: return pivot
def getbest(li):
''' li is a list of tuples (x,y) '''
l = len(li)
lix = [x[0] for x in li]
liy = [x[1] for x in li]
mid_x1 = nth(lix, l/2) if l%2==1 else nth(lix, l/2-1)
mid_x2 = nth(lix, l/2)
mid_y1 = nth(liy, l/2) if l%2==1 else nth(liy, l/2-1)
mid_y2 = nth(liy, l/2)
mindist = sys.maxint
minp = None
for p in li:
dist = 0 if mid_x1 <= p[0] <= mid_x2 else min(abs(p[0]-mid_x1), abs(p[0]-mid_x2))
dist += 0 if mid_y1 <= p[1] <= mid_y2 else min(abs(p[1]-mid_y1), abs(p[1]-mid_y2))
if dist < mindist:
minp, mindist = p, dist
return minp
It's based on the solution of the one dimensional problem - for a list of numbers find a number for which the sum distance is the minimum.
The solution for this is the middle element of the (sorted) list or any number between the two middle elements (including these two elements) if there are an even number of elements in the list.
Update: my nth
algorithm seems to be very slow, probably there is a better way to rewrite it, sort
outperforms it with < 100000 elements, so if you do speed comparison, just add sort(lix); sort(liy);
and
def nth(sample, n):
return sample[n]
For anyone out there who wants to test his solution, here is what I use. Just run a loop, generate input and compare your solution with the output of bruteforce.
import random
def example(length):
l = []
for x in range(length):
l.append((random.randint(-100, 100), random.randint(-100,100)))
return l
def bruteforce(li):
bestsum = sys.maxint
bestp = None
for p in li:
sum = 0
for p1 in li:
sum += max(abs(p[0]-p1[0]), abs(p[1]-p1[1]))
if sum < bestsum:
bestp, bestsum = p, sum
return bestp
I can imagine a scheme better than O(n^2), at least in the common case.
Build a quadtree out of your input points. For each node in the tree, compute the number and average position of the points within that node. Then for each point, you can use the quadtree to compute its distance to all other points in less than O(n) time. If you're computing the distance from a point p to a distant quadtree node v, and v doesn't overlap the 45 degree diagonals from p, then the total distance from p to all the points in v is easy to compute (for v which are more horizontally than vertically separated from p, it is just v.num_points * |p.x - v.average.x|
, and similarly using y coordinates if v is predominately vertically seperated). If v overlaps one of the 45 degree diagonals, recurse on its components.
That should beat O(n^2), at least when you can find a balanced quadtree to represent your points.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With