
BeautifulSoup: get tag name of element itself, not its children

I have the below (simplified) code, which uses the following source:

    <html>
        <p>line 1</p>
        <div>
            <a>line 2</a>
        </div>
    </html>

    soup = BeautifulSoup('<html><p>line 1</p><div><a>line 2</a></div></html>')
    ele = soup.find('p').nextSibling
    somehow_print_tag_of_ele_here

I want to get the tag name of ele, in this case "div". However, I only seem to be able to get the tag names of its children. Am I missing something simple? I thought that I could do ele.tag.name, but that raises an exception since ele.tag is None.

    # Below correctly prints the div element "<div><a>line 2</a></div>"
    print ele

    # Below prints "None". Printing tag.name is an exception since tag is None
    print ele.tag

    # Below prints "a", the child of ele
    allTags = ele.findAll(True)
    for e in allTags:
        print e.name

At this point, I am considering something along the lines of getting the parent of ele, then getting the tag names of the parent's children and, having counted how many preceding siblings ele has, counting down to the correct child tag. That seems ridiculous.

user984003 asked Dec 16 '11 11:12



1 Answer

ele is already a tag, so you can read its name directly. Try this:

    soup = BeautifulSoup('<html><p>line 1</p><div><a>line 2</a></div></html>')
    print(soup.find('p').nextSibling.name)

So in your example, it would be just:

    print(ele.name)
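For anyone wondering why ele.tag came back as None: attribute access on a tag is shorthand for find(), so ele.tag looks for a child element literally named "tag". Here is a minimal sketch of the whole thing. It assumes the newer bs4 package (Beautiful Soup 4) on Python 3 with the built-in 'html.parser', neither of which is in the question, so it uses next_sibling, the bs4 spelling of nextSibling. The second document with newlines between the tags is my own illustration, not the asker's input.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString

# Same one-line markup as in the question (assumption: bs4 + html.parser).
html = '<html><p>line 1</p><div><a>line 2</a></div></html>'
soup = BeautifulSoup(html, 'html.parser')

ele = soup.find('p').next_sibling
ele_name = ele.name   # name of ele itself
print(ele_name)       # div

# ele.tag is shorthand for ele.find('tag'); there is no <tag> child, so this is None.
ele_tag = ele.tag
print(ele_tag)        # None

# Illustrative variant: with whitespace between the tags, the next sibling of <p>
# is a text node rather than the <div>. find_next_sibling() skips text nodes.
spaced = '<html>\n<p>line 1</p>\n<div><a>line 2</a></div>\n</html>'
p = BeautifulSoup(spaced, 'html.parser').find('p')
sibling_is_text = isinstance(p.next_sibling, NavigableString)
next_tag_name = p.find_next_sibling().name
print(sibling_is_text, next_tag_name)
```

If your real HTML is pretty-printed like the source in the question, find_next_sibling() is probably the safer call, since a plain next sibling can land on the whitespace between the elements.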
Sebastian Piu answered Oct 22 '22 20:10