Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Linear algorithm of finding tree diameter

Tags:

algorithm

tree

I have the following code to find the diameter of a tree:

ALGORITHM: TREE_DIAMETER (T)

maxlength ← 0
for s ← 0 to s < |V[T]|
      do temp ← BSF(T, S)
            if maxlength < temp
                   maxlength ← temp
                   increment s by 1
return maxlength

But this algorithm does not run in linear time. Preserving the details of the above code as much as possible (Eg: using BSF), is it possible to optimize it to linear time? You can use pseudo-code.

like image 962
user152356 Avatar asked Sep 03 '14 16:09

user152356


2 Answers

Here is a simple algorithm with linear time complexity:
1)Pick an arbitrary vertex v.
2)Find the furthest vertex from v using BFS(let's call it u).
3)Find the furthest vertex from u using BFS one more time(let's call it t).
Then distance(u, t) is the diameter.
BFS is called only twice, so the time complexity is linear.

like image 144
kraskevich Avatar answered Sep 26 '22 22:09

kraskevich


As supplementary to the other answer, a proof can be done as follow:

Let's first denote the two ends of the diameter in the tree as s and t, and the function d(u, v) denotes the distance between node u and node v.

Now we choose an arbitrary node X, then we have 2 cases:

  1. X is on the diameter;
  2. X is not on the diameter.

For case 1, it's easy to see that by doing (2) (the second step in the algorithm described in @user2040251's answer), we will get either d(X, s) or d(X, t). If we get something else, say d(X, u), then u and s or t can form a longer path than d(s, t), which contradicts the fact. Therefore, by doing (3), we will get d(s, t).

For case 2, by doing (2), we have 2 cases:

  1. The longest path starting from X contains at least 1 point on the diameter;
  2. The longest path starting from X doesn't share any points with the diameter.

For case 2.1, let's denote the first intersection as Y. Because of case 1, we know the longest path starting from Y must end at either s or t. Therefore, in this case, the longest path starting from X must end at either s or t. Consequently, by doing (3), we will get d(s, t).

For case 2.2, let's denote the other end of the longest path starting from X as Z. Since neither X or Z is on the diameter, and given X, Z, s, t are on the same tree, we can conclude that there must be a node V on the path X to Z, links to a node W on the path s to t. Because X to Z is the longest path starting from X, so d(X, V) + d(V, W) + d(W, t) < d(X, Z). Similarly, we have d(Z, V) + d(V, W) + d(W, s) < d(X, Z). Adding them up and simplify will give us: d(X, Z) > 2d(V, W) + d(s, t) > d(s, t), which contradicts with the fact that s to t is the diameter.

A graph that illustrates case 2.2:

s             Z
|             |
|             |
|             |
W-------------V
|             |
|             |
|             |
t             X

So we have:

d(X, V) + d(V, W) + d(W, t) < d(X, Z) because X to Z is the longest path starting from X;

d(Z, V) + d(V, W) + d(W, s) < d(X, Z) because X to Z is the longest path starting from Z;

Adding up 2 expressions:

d(X, Z) + 2d(V, W) + d(s, t) < 2d(X, Z)

Finally, we have d(X, Z) > 2d(V, W) + d(s, t) > d(s, t), which contradicts the fact.

like image 39
nevets Avatar answered Sep 25 '22 22:09

nevets