Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Algorithm for minimum manhattan distance

Tags:

algorithm

I wish to find the point with the minimum sum of manhattan distance/rectilinear distance from a set of points (i.e the sum of rectilinear distance between this point and each point in the set should be minimum ). The resulting point can be one of the points from the given set (not necessarily). In case more than one points exist with the same minimum distance, i wish to retrieve all of them.

IN OTHER WORDS:

I have a grid with certain intersections marked. I'd like to find the intersection that is closest to all the marked intersections. That is, I need to find a point such that the sum of distances from all points is minimum.

like image 930
user1045047 Avatar asked May 01 '12 18:05

user1045047


People also ask

How do you code a Manhattan distance?

Manhattan distance between two points (x1, y1) and (x2, y2) is considered as abs(x1 - x2) + abs(y1 - y2), where abs(x) is the absolute value of x.

How do you find the shortest distance between points?

The shortest distance between two points is a straight line. This distance can be calculated by using the distance formula. The distance between two points ( x 1 , y 1 ) and ( x 2 , y 2 ) can be defined as d = ( x 2 − x 1 ) 2 + ( y 2 − y 1 ) 2 .

How do you find the maximum distance between two points on a Manhattan?

The Manhattan Distance between two points (X1, Y1) and (X2, Y2) is given by |X1 – X2| + |Y1 – Y2|.

What is an alternative form of Manhattan distance?

Also known as rectilinear distance, Minkowski's L1 distance, taxi cab metric, or city block distance. Hamming distance can be seen as Manhattan distance between bit vectors.


1 Answers

The cool thing about the Manhatan distance is that the distance itself comprises of two independent components: the distance on the x and y coordinate. Thus you can solve two simpler tasks and merge the results from them to obtain the desired results.

The task I speak of is: given are points on a line. Find the point on the line that minimizes the sum of the absolute distances to all the points. If there are many find all of them (btw they always turn to be a single segment which is easy to prove). The segment is determined by the (potentially two) points medians of the set. By median I mean the point that has equal number of points to the left and to the right of it. In case the number of points is even there is no such point and you choose the points with difference 1 in both directions to form the segment.

Here I add examples of solutions of this simpler task:

In case the points on the line are like that:

-4 | | | 0 | 2 3 4
             ^

The solution is just a point and it is 2.

In case the points on the line are like that:

-4 | | | 0 | 2 3
         ^---^

The whole segment [0, 2] is the solution of the problem.

You solve this task separately for the x and y coordinate and then merge the results to obtain the rectangle of minimum distanced points.


EXAMPLE

And now comes an example of the solution for the initial task.

Imagine you want to find the points that are with minimum Manhatan distance to the set (0, 6), (1, 3), (3, 5), (3, 3), (4, 7), (2, 4)

You form the two simpler tasks:

For x:

0 1 2 3 3 4
    ^-^

And here the solution is the segment [2, 3] (note that here we have duplicated point 3, which I represented in probably not the most intuitive way).

For y:

3 3 4 5 6 7
    ^-^

Here the solution is the segment [4, 5].

Finally we get that the solution of the initial task is the rectangle with formula:

 2 <= x <= 3; 4 <= y <= 5 

COMPLEXITY

As many people show interest in this post I decided to improve it a bit more.

Let's speak about complexity.

The complexity of the task is actually the same as the complexity of solving the simpler task (because as already discussed the solution actually consists of solving two simpler tasks). Many people will go and solve it via sorting and then choosing out the medians. However, this will cause O(nlog n) complexity, where n is the number of points in the input set.

This can be improved if a better algorithm for finding the kth element is used (Example of implementation in the C++ STL). This algorithm basically follows the same approach as qsort. The running time is O(n). Even in the case of two median points this will still remain linear (needing two runs of the same algorithm), and thus the total complexity of the algorithm becomes O(n). It is obvious that the task can not be solved any faster, as long as the input itself is of the mentioned complexity.

like image 133
Boris Strandjev Avatar answered Sep 29 '22 11:09

Boris Strandjev