Algorithm: Distance transform - any faster algorithm?

I'm trying to solve the distance transform problem (using Manhattan distance). Basically, given a matrix of 0's and 1's, the program must assign to every position its distance to the nearest 1. For example, for this one

0000
0100
0000
0000

distance transform matrix is

2123
1012
2123
3234

Possible solutions off the top of my head are:

Slow ones (I know they are slow because I tried implementing them - they were lagging on very big matrices):

  1. Brute force - for every 1 the program reads, update the distances accordingly from beginning to end.

  2. Breadth-first search from 0's - for every 0, the program looks for the nearest 1, working inside out.

  3. Same as 2, but starting from the 1's and marking every distance inside out.

  4. Much faster (learned from other people's code):

    Breadth-first search from 1's

    1. Assign every value in the distance matrix to -1 or a very big value.
    2. While reading the matrix, put the positions of all 1's into a queue.
    3. While the queue is not empty
        a. Dequeue a position - call it x
        b. For each position around x (at distance 1 from it)
            if the position is valid (does not exceed the matrix dimensions) then
                if its distance is uninitialized or greater than (distance of x) + 1 then
                    I. distance = (distance of x) + 1
                    II. enqueue the position
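The steps above can be sketched in C; the grid size, the function name `distance_transform`, and the array-backed queue are illustrative assumptions:

```c
#define ROWS 4
#define COLS 4
#define INF  1000000

/* Multi-source BFS: seed the queue with every 1-cell at distance 0,
 * then expand outward one ring at a time.  Each cell is enqueued at
 * most once, so a plain array of ROWS*COLS entries suffices as queue. */
void distance_transform(int grid[ROWS][COLS], int dist[ROWS][COLS])
{
    int qr[ROWS * COLS], qc[ROWS * COLS];
    int head = 0, tail = 0;

    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < COLS; c++) {
            dist[r][c] = INF;               /* step 1: "very big value" */
            if (grid[r][c] == 1) {          /* step 2: enqueue all 1's */
                dist[r][c] = 0;
                qr[tail] = r; qc[tail] = c; tail++;
            }
        }

    const int dr[4] = {-1, 1, 0, 0};
    const int dc[4] = {0, 0, -1, 1};

    while (head < tail) {                   /* step 3: BFS expansion */
        int r = qr[head], c = qc[head]; head++;
        for (int k = 0; k < 4; k++) {
            int nr = r + dr[k], nc = c + dc[k];
            if (nr < 0 || nr >= ROWS || nc < 0 || nc >= COLS)
                continue;                   /* position not valid */
            if (dist[nr][nc] > dist[r][c] + 1) {
                dist[nr][nc] = dist[r][c] + 1;
                qr[tail] = nr; qc[tail] = nc; tail++;
            }
        }
    }
}
```

Because the BFS processes cells in nondecreasing distance order, a cell's distance is final the first time it is set, which is why the queue can never overflow.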
    

I wanted to ask if there is a faster solution to this problem. I tried searching for distance transform algorithms, but most of them deal with Euclidean distance.

Thanks in advance.

asked May 06 '13 by Kudayar Pirimbaev

2 Answers

The breadth-first search performs Θ(n*m) operations, where n and m are the width and height of your matrix.

You need to output Θ(n*m) numbers, so you can't get any faster than that from a theoretical point of view.

I'm assuming you are not interested in discussions involving caches and similar optimizations.


Note that this solution works in more interesting cases. For example, imagine the same question, but there could be different "sources":

00000
01000
00000
00000
00010

Using BFS, you will get the following distance-to-closest-source in the same time complexity:

21234
10123
21223
32212
32101

However, with a single source, there is another solution that might have a slightly better performance in practice (even though the complexity is still the same).

First, let's observe the following property.

Property: If source is at (a, b), then a point (x, y) has the following manhattan distance:

d(x, y) = abs(x - a) + abs(y - b)

This should be quite easy to prove. So another algorithm would be:

for r in rows
  for c in cols
    d(r, c) = abs(r - a) + abs(c - b)

which is very short and easy.
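That loop can be sketched as a C function; the array size and the name `fill_from_source` are illustrative assumptions:

```c
#include <stdlib.h>  /* abs */

#define ROWS 5
#define COLS 5

/* Single source at (a, b): every cell's Manhattan distance is known in
 * closed form, so one pass with no queue fills the whole matrix. */
void fill_from_source(int dist[ROWS][COLS], int a, int b)
{
    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < COLS; c++)
            dist[r][c] = abs(r - a) + abs(c - b);
}
```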


Unless you write and test them, there is no easy way to compare the two algorithms. Assuming an efficient bounded queue implementation (backed by an array), you have the following major operations per cell:

  • BFS: queue insertion/deletion, plus each node is visited 5 times (four times by its neighbors and once when dequeued)
  • Direct formula: two subtractions and two absolute values (two ifs)

It would really depend on the compiler and its optimizations as well as the specific CPU and memory architecture to say which would perform better.

That said, I'd advise going with whichever seems simpler to you. Note, however, that with multiple sources the second solution needs multiple passes over the array (or multiple distance calculations in one pass), which would definitely perform worse than BFS for a large enough number of sources.

answered Nov 20 '22 by Shahbaz


You don't need a queue or anything like that at all. Notice that if (i,j) is at distance d from (k,l), one way to realise that distance is to go up or down |i-k| times and then left or right |j-l| times.

So, initialise your matrix with big numbers and stick a zero everywhere you have a 1 in your input. Now do something like this:

for (i = 0; i < sx-1; i++) {
  for (j = 0; j < sy-1; j++) {
    dist[i+1][j] = min(dist[i+1][j], dist[i][j]+1);
    dist[i][j+1] = min(dist[i][j+1], dist[i][j]+1);
  }
  dist[i+1][sy-1] = min(dist[i+1][sy-1], dist[i][sy-1]+1);
}
for (j = 0; j < sy-1; j++) {
  dist[sx-1][j+1] = min(dist[sx-1][j+1], dist[sx-1][j]+1);
}

At this point, you've found all of the shortest paths that involve only going down or right. If you do a similar thing for going up and left, dist[i][j] will give you the distance from (i, j) to the nearest 1 in your input matrix.
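Putting the two sweeps together gives the classic two-pass L1 distance transform. Here is a self-contained sketch in C, written in the equivalent "pull" form where each cell takes the minimum over its already-processed neighbors; the fixed grid size and the name `transform` are illustrative assumptions:

```c
#define SX  4
#define SY  4
#define INF 1000000

static int min(int a, int b) { return a < b ? a : b; }

/* Two-pass L1 distance transform: a down/right sweep followed by the
 * mirrored up/left sweep.  After both sweeps, dist[i][j] holds the
 * Manhattan distance from (i, j) to the nearest 1 in the input. */
void transform(int in[SX][SY], int dist[SX][SY])
{
    /* big numbers everywhere, zero wherever the input has a 1 */
    for (int i = 0; i < SX; i++)
        for (int j = 0; j < SY; j++)
            dist[i][j] = in[i][j] ? 0 : INF;

    /* forward sweep: shortest paths that only go down or right */
    for (int i = 0; i < SX; i++)
        for (int j = 0; j < SY; j++) {
            if (i > 0) dist[i][j] = min(dist[i][j], dist[i-1][j] + 1);
            if (j > 0) dist[i][j] = min(dist[i][j], dist[i][j-1] + 1);
        }

    /* backward sweep: shortest paths that only go up or left */
    for (int i = SX-1; i >= 0; i--)
        for (int j = SY-1; j >= 0; j--) {
            if (i < SX-1) dist[i][j] = min(dist[i][j], dist[i+1][j] + 1);
            if (j < SY-1) dist[i][j] = min(dist[i][j], dist[i][j+1] + 1);
        }
}
```

Each pass touches every cell once with constant work, so the whole thing is Θ(n*m) with no queue and a very cache-friendly access pattern.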

answered Nov 20 '22 by tmyklebu