I was just studying up on some Google Interview Questions and I stumbled upon this one online that I can't seem to even understand
Consider an undirected tree with N nodes, numbered from 1 to N. Each node has a label associated with it, which is an integer value. Different nodes can have the same label. Write a function that, given a zero indexed array A of length N, where A[j] is the label value of the (j + 1)-th node in the tree and a zero-indexed array E of length K = (N – 1) * 2 in which the edges of the tree are described, returns the length of the longest path such that all the nodes on that path have the same label. The length is the number of edges in that path.
Example:
A = [1, 1, 1, 2, 2] E = [1, 2, 1, 3, 2, 4, 2, 5]
This tree is shown below. A node follows the form label, value.
----------1, 1
-----1, 2 1, 3
2, 4 2, 5
The function should return 2, because the longest path is 2->1->3, and there are 2 edges in this path.
Assume that 1 <= N <= 1000 and each element of the array A is an integer in the range [1, 1000000000].
What is your take on breaking this up to solve? Thanks!
Explanation: The longest path in a randomized binary search tree, but it has been found that the longest length is around 4.311 log x for node x. This is also equal to 1/β log x where β lies in the range (0, 1).
We can find the longest path using two BFSs. The idea is based on the following fact: If we start BFS from any node x and find a node with the longest distance from x, it must be an endpoint of the longest path. It can be proved using contradiction.
The diameter/width of a tree is defined as the number of nodes on the longest path between two end nodes.
Here is my suggestion. I haven't checked it thoroughly.
from collections import defaultdict
A = [1, 1, 1, 2, 2]
E = [1, 2, 1, 3, 2, 4, 2, 5]
d = defaultdict(list)
it = iter(E)
for k, v in zip(it, it):
d[k].append(v)
d[v].append(k)
leaves = [k for k, v in d.items() if len(v) == 1]
def len_path(node, length=0, coming_from=None):
max_loc_len = length
loc_data = A[node-1]
available_nodes = {i: A[i-1] for i in d[node]}
available_nodes.pop(coming_from, None)
for i, i_data in available_nodes.items():
cumul = length + 1 if loc_data == i_data else 0
loc_len = len_path(i, cumul, node)
if loc_len > max_loc_len:
max_loc_len = loc_len
return max_loc_len
max_len = max([len_path(leaf) for leaf in leaves])
And a more sophisticated search, avoiding computing the same path twice:
# ...
def len_path(node, length=0, coming_from=None, fork=False):
max_loc_len = length
loc_data = A[node-1]
available_nodes = {i: A[i-1] for i in d[node]}
available_nodes.pop(coming_from, None)
matching_nodes = [k for k, v in available_nodes.items()
if v == loc_data and k not in leaves[:explored]]
if len(matching_nodes) > 1:
fork = True
if not available_nodes and not fork and node in leaves[explored:]:
# Reached an unexplored leaf without forks on the path,
# hence no need to explore it later
leaves.remove(node)
for i, i_data in available_nodes.items():
cumul = length + 1 if loc_data == i_data else 0
loc_len = len_path(i, cumul, node, fork)
if loc_len > max_loc_len:
max_loc_len = loc_len
return max_loc_len
explored = 0
max_len =0
while explored < len(leaves):
length = len_path(leaves[explored])
if length > max_len:
max_len = length
explored += 1
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With