
beautiful soup getting tag.id

I'm attempting to get a list of div ids from a page. When I print out the attributes, I get the ids listed.

    for tag in soup.find_all(class_="bookmark blurb group"):
        print(tag.attrs)

results in:

    {'id': 'bookmark_8199633', 'role': 'article', 'class': ['bookmark', 'blurb', 'group']}
    {'id': 'bookmark_7744613', 'role': 'article', 'class': ['bookmark', 'blurb', 'group']}
    {'id': 'bookmark_7338591', 'role': 'article', 'class': ['bookmark', 'blurb', 'group']}
    {'id': 'bookmark_7338535', 'role': 'article', 'class': ['bookmark', 'blurb', 'group']}
    {'id': 'bookmark_4530078', 'role': 'article', 'class': ['bookmark', 'blurb', 'group']}

So I know there ARE ids. However, when I print out tag.id instead, I just get a list of "None". What am I doing wrong here?

klreeher asked Jul 25 '14 18:07




2 Answers

You can access a tag's attributes by treating the tag like a dictionary (see the Beautiful Soup documentation):

    for tag in soup.find_all(class_="bookmark blurb group"):
        print(tag.get('id'))

The reason tag.id didn't work is that it is equivalent to tag.find('id'), which returns None because the element contains no child tag named id (see the Beautiful Soup documentation).
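To make the difference concrete, here is a minimal sketch. The HTML string is hypothetical, modeled on the attributes printed in the question, and it assumes the built-in "html.parser" backend rather than whatever parser the original page was loaded with.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modeled on the attributes shown in the question.
html = """
<div id="bookmark_8199633" role="article" class="bookmark blurb group"></div>
<div id="bookmark_7744613" role="article" class="bookmark blurb group"></div>
"""

soup = BeautifulSoup(html, "html.parser")
tags = soup.find_all(class_="bookmark blurb group")

# Attribute-style access searches for a child element named <id>,
# so it yields None for these divs.
child_lookups = [tag.id for tag in tags]

# Dictionary-style access reads the HTML attribute instead.
ids = [tag.get("id") for tag in tags]      # None if the attribute is missing
ids_strict = [tag["id"] for tag in tags]   # KeyError if the attribute is missing

print(child_lookups)
print(ids)
```

tag.get('id') is the safer choice when some matched tags might not carry an id; tag['id'] is shorter when every match is known to have one.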

alecxe answered Sep 23 '22 20:09



This solution lists every tag on the page that has an id. It might be helpful too.

    tags = page_soup.find_all()
    for tag in tags:
        if 'id' in tag.attrs:
            print(tag.name, tag['id'], sep='->')
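As a possible variation (not part of the original answer), find_all can do the filtering itself: passing id=True matches any tag that has an id attribute, regardless of its value. The markup below is a made-up example, and "html.parser" is an assumed parser choice.

```python
from bs4 import BeautifulSoup

# Made-up markup: two tags with an id, one without.
html = '<div id="a"><p>no id</p><span id="b"></span></div>'
page_soup = BeautifulSoup(html, "html.parser")

# id=True keeps only tags that define an id attribute.
pairs = [(tag.name, tag["id"]) for tag in page_soup.find_all(id=True)]

for name, tag_id in pairs:
    print(name, tag_id, sep="->")
```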
Thunder answered Sep 20 '22 20:09
