Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

BeautifulSoup: AttributeError: 'NavigableString' object has no attribute 'name'

Do you know why the first example in BeautifulSoup tutorial http://www.crummy.com/software/BeautifulSoup/documentation.html#QuickStart gives AttributeError: 'NavigableString' object has no attribute 'name'? According to this answer the space characters in the HTML causes the problem. I tried with sources of a few pages and 1 worked the others gave the same error (I removed spaces). Can you explain what does "name" refer to and why this error happens? Thanks.

like image 705
Zeynel Avatar asked Sep 29 '11 01:09

Zeynel


People also ask

What is NavigableString in Beautifulsoup?

A NavigableString object holds the text within an HTML or an XML tag. This is a Python Unicode string with methods for searching and navigation. Sometimes we may need to navigate to other tags or text within an HTML/XML document based on the current text.

Which of the objects of Beautifulsoup is not editable?

BeautifulSoupD. ParserCorrect Option : BEXPLANATION : You cannot edit the Navigable String object but can convert it into a Unicode stringusing the function Unicode.

Is comment an object of Beautifulsoup?

The four major and important objects are :Comments.

What does Beautiful Soup do?

Beautiful Soup is a Python package for parsing HTML and XML documents (including having malformed markup, i.e. non-closed tags, so named after tag soup). It creates a parse tree for parsed pages that can be used to extract data from HTML, which is useful for web scraping.


1 Answers

name will refer to the name of the tag if the object is a Tag object (ie: <html> name = "html")

if you have spaces in your markup in between nodes BeautifulSoup will turn those into NavigableString's. So if you use the index of the contents to grab nodes, you might grab a NavigableString instead of the next Tag.

To avoid this, query for the node you are looking for: Searching the Parse Tree

or if you know the name of the next tag you would like, you can use that name as the property and it will return the first Tag with that name or None if no children with that name exist: Using Tag Names as Members

If you wanna use the contents you have to check the objects you are working with. The error you are getting just means you are trying to access the name property because the code assumes it's a Tag

like image 192
MattoTodd Avatar answered Sep 21 '22 10:09

MattoTodd