I have a set of n nodes on a (non-binary) tree. I want to find the maximum of the distances between any two of the nodes. (I define the distance between two nodes to be the sum of the distances between those nodes and their lowest common ancestor). I can easily solve this problem in O(n^2) by just computing the distance between each node to each other node and getting the maximum, however I'm hoping for something better as this is far too slow* for my application scenario. (Extra info: In my application scenario, these nodes are actually files and the tree is a directory structure. Therefore, the tree is quite shallow (depth < ~10), but it may have 300,000+ nodes. The sets of files can be anywhere between ~3-200 in size. Effectively, I'm trying to figure out how far spread out are my files in each set.)* Edit: Perhaps I can ask an equivalent question to prompt more answers: Consider a subset of the original tree that only contains the nodes in my set and the nodes necessary to connect them. Then the question becomes: How do I find the longest simple path in an undirected acyclical graph? *Edit 2: As didierc pointed out, I actually should be considering sets of folders not files. This makes my sets smaller and the exhaustive approach may be fast enough. Still, it would be beneficial to see a faster solution, and I'm curious to see if there is one.

I have a simple O(n) greedy algorithm for this interesting problem. <h3>Algorithm</h3> <ol> <li>Select arbitrary vertice X as the root of the tree, and then find the vertice Y who has the maximum distance with the root X. The complexity of this step is O(n).</li> <li>Make vertice Y as the new root of the tree, then find the vertice Z who has the maximum distance with the root Y. And the distance between Y and Z is the maximum of the distances in the tree. The complexity of this step is also O(n).</li> <li>The total complexity of this greedy algorithm is O(n).</li> </ol> <h3>Proof</h3> <ul> <li>It is obviously that Y and Z form one diameter of the tree, and we call Y and Z a pair of corners of the tree.</li> <li> Theorem: For each vertice P in the tree, either Y or Z would be the vertice who has the maximum distance with it.</li> <li>The step 1 of the algorithm is based on Theorem, so we can easily get one corner (Y) of the tree.</li> <li>The second corner Z is found based on the Theorem as well.</li> </ul> <h3>Extend</h3> According to the Theorem in the proof, we can solve another more challenging problem: For each vertice in the tree, calculate who is fareast vertice to it. <ul> <li>We can find the two corners of the tree in O(n) complexity, and then we can use the Theorem again.</li> <li>From corner Y and Z we do dfs respectively, and for each vertices p[i] we can get the distance to Y and Z(we call them disY[i] and disZ[i]), so the fareast distance for p[i] is max(disY[i], disZ[i]). Due to we just do dfs twice, so we can get the infomation in O(n) complexity.</li> <li>This extended problem can also solved by complexible tree dynamic programming whose complexity is also O(n).</li> </ul>

How to find the max distance between a set of nodes on a tree?

Q: How do you find the maximum distance between two nodes in a tree?

The distance between two nodes can be obtained in terms of lowest common ancestor. Following is the formula. Dist(n1, n2) = Dist(root, n1) + Dist(root, n2) - 2*Dist(root, lca) 'n1' and 'n2' are the two given keys 'root' is root of given Binary Tree.

Q: What is the distance between nodes?

The distance between two nodes is defined as the total number of edges in the shortest path from one node to other. For example, consider the binary tree. The distance between node 7 and node 6 is 3. This problem is a standard application of the lowest common ancestor of given nodes.

Q: How do you find the maximum number of nodes?

If binary tree has height h, maximum number of nodes will be when all levels are completely full. Total number of nodes will be 2^0 + 2^1 + …. 2^h = 2^(h+1)-1. For example, the binary tree shown in Figure 2(b) with height 2 has 2^(2+1)-1 = 7 nodes.

Q: How do you find the distance between two nodes on a graph?

Starts from the first node and then keep hopping from the current set of nodes until you reach the target. The point of visited-node list is to prevent visiting the visited nodes, resulting in a loop. And to get shortest distance, it's no use to make a revisit as it always makes the distance of resulting path longer.

Tags:

algorithm

graph

tree

I have a set of n nodes on a (non-binary) tree. I want to find the maximum of the distances between any two of the nodes. (I define the distance between two nodes to be the sum of the distances between those nodes and their lowest common ancestor).

I can easily solve this problem in O(n^2) by just computing the distance between each node to each other node and getting the maximum, however I'm hoping for something better as this is far too slow* for my application scenario.

(Extra info: In my application scenario, these nodes are actually files and the tree is a directory structure. Therefore, the tree is quite shallow (depth < ~10), but it may have 300,000+ nodes. The sets of files can be anywhere between ~3-200 in size. Effectively, I'm trying to figure out how far spread out are my files in each set.)*

Edit: Perhaps I can ask an equivalent question to prompt more answers: Consider a subset of the original tree that only contains the nodes in my set and the nodes necessary to connect them. Then the question becomes: How do I find the longest simple path in an undirected acyclical graph?

*Edit 2: As didierc pointed out, I actually should be considering sets of folders not files. This makes my sets smaller and the exhaustive approach may be fast enough. Still, it would be beneficial to see a faster solution, and I'm curious to see if there is one.

398

asked Nov 21 '12 20:11

Daryl

2 Answers

Your problem is also known as finding the diameter of a tree: among all shortest paths between pairs of nodes, you're seeking the longest.

Denote the diameter of a tree S by d(S), and its height by h(S).

The two most distant nodes in a tree S with subtrees S1...Sd can be either under one of its subtrees, or they might span two subtrees. In the first case, when the two most distant nodes are under subtree Si, d(S) is just d(Si). In the second case, when the two most distance nodes span two subtrees, say Si and Sj, their distance is h(Si) + h(Sj) + 2 because the two nodes must be the two deepest nodes in each subtree, plus two more edges to join the two subtrees. In fact, in this second case, Si and Sj must be the heighest and second heighest subtree of S.

An O(n) algorithm would proceed as follows

Algorithm d(S)

1. recursively compute d(S1)...d(Sd) and h(S1)...h(Sd) of the subtrees of S.
2. denote by Si be the deepest subtree and Sj the second deepest subtree
3. return max(d(S1), ..., d(Sd), h(Si)+h(Sj)+2)

Analysis

Lines 2 and 3 each take O(d) time to compute. But each node is examined only once by these lines, so in recursion, this takes a total of O(n).

148

answered Oct 17 '22 21:10

moos

I have a simple O(n) greedy algorithm for this interesting problem.

Algorithm

Select arbitrary vertice X as the root of the tree, and then find the vertice Y who has the maximum distance with the root X. The complexity of this step is O(n).
Make vertice Y as the new root of the tree, then find the vertice Z who has the maximum distance with the root Y. And the distance between Y and Z is the maximum of the distances in the tree. The complexity of this step is also O(n).
The total complexity of this greedy algorithm is O(n).

Proof

It is obviously that Y and Z form one diameter of the tree, and we call Y and Z a pair of corners of the tree.
Theorem: For each vertice P in the tree, either Y or Z would be the vertice who has the maximum distance with it.
The step 1 of the algorithm is based on Theorem, so we can easily get one corner (Y) of the tree.
The second corner Z is found based on the Theorem as well.

Extend

According to the Theorem in the proof, we can solve another more challenging problem: For each vertice in the tree, calculate who is fareast vertice to it.

We can find the two corners of the tree in O(n) complexity, and then we can use the Theorem again.
From corner Y and Z we do dfs respectively, and for each vertices p[i] we can get the distance to Y and Z(we call them disY[i] and disZ[i]), so the fareast distance for p[i] is max(disY[i], disZ[i]). Due to we just do dfs twice, so we can get the infomation in O(n) complexity.
This extended problem can also solved by complexible tree dynamic programming whose complexity is also O(n).

answered Oct 17 '22 20:10

Jun HU

Related questions
                            
                                Using a QuadTree to get all points within a bounding circle
                            
                                how to implement back and forward functionality like browser
                            
                                Android Mapview: Merging overlapping markers into a new marker
                            
                                For parallel algorithm with N threads, can performance gain be more than N?
                            
                                given an array, for each element, find out the total number of elements lesser than it, which appear to the right of it
                            
                                Generalised Two-Egg Puzzle
                            
                                Finding greatest sum of elements of array which is divisible by a given number
                            
                                Implementing Stack with Python
                            
                                How to generate random graphs?
                            
                                Finding the shortest path nodes with breadth first search
                            
                                Sorting a list with a comparison function that doesn't follow 'strict weak ordering'
                            
                                What is a data structure kind of like a hash table, but infrequently-used keys are deleted?
                            
                                Algorithms to identify Markov generated content?
                            
                                Algorithm to make halftone images?
                            
                                A good algorithm for generating an order number
                            
                                Algorithm to calculate a page importance based on its views / comments
                            
                                Algorithm for almost similar values search
                            
                                find overlapping rectangles algorithm
                            
                                Where might I find a method to convert an arbitrary boolean expression into conjunctive or disjunctive normal form?
                            
                                Check if a string is rotation of another WITHOUT concatenating

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With