Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

BeautifulSoup lowest common ancestor

Does the BeautifulSoup library for Python have any function that can take a list of nodes and return the lowest common ancestor?

If not, has any of you ever implemented such a function and care to share it?

like image 455
Ionut Hulub Avatar asked Oct 23 '25 15:10

Ionut Hulub


2 Answers

I think this is what you want, with link1 being one element and link2 being another;

link_1_parents = list(link1.parents)[::-1]
link_2_parents = list(link2.parents)[::-1]

common_parent = [x for x,y in zip(link_1_parents, link_2_parents) if x is y][-1]

print common_parent
print common_parent.name

It'll basically walk both elements' parents from root down, and return the last common one.

like image 168
Joachim Isaksson Avatar answered Oct 26 '25 04:10

Joachim Isaksson


The accepted answer does not work if the distance from a tag in the input list to the lowest common ancestor is not the exact same for every nodes in the input.

It also uses every ancestors of each node, which is unnecessary and could be very expensive in some cases.

import collections
def lowest_common_ancestor(parents=None, *args):
    if parents is None:
        parents = collections.defaultdict(int)
    for tag in args:
        if not tag:
            continue
        parents[tag] += 1
        if parents[tag] == len(args):
            return tag
    return lowest_common_ancestor(parents, *[tag.parent if tag else None for tag in args])
like image 28
Arthur Avatar answered Oct 26 '25 05:10

Arthur



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!