
Python Beautiful Soup 'NavigableString' object has no attribute 'get_text'

I'm trying to extract the text inside the second <li> of the following HTML structure:

<div class="account-places">
    <div>
        <ul class="location-history">
            <li></li>
            <li>Text to extract</li>
        </ul>
    </div>
</div>

I have the following BeautifulSoup code to do it:

from bs4 import BeautifulSoup as bs

soup = bs(html, "lxml")
div = soup.find("div", {"class": "account-places"})
text = div.div.ul.li.next_sibling.get_text()

But Beautiful Soup is throwing the error: 'NavigableString' object has no attribute 'get_text'. What am I doing wrong?

Brinley asked Jun 05 '18 16:06


2 Answers

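It looks like you need find_next_sibling("li"), which skips the whitespace text node between the two <li> tags and returns the next <li> element.

Example: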

from bs4 import BeautifulSoup as bs

soup = bs(html, "lxml")
div = soup.find("div", {"class": "account-places"})
text = div.div.ul.li.find_next_sibling("li").get_text()
print(text)

Output:

Text to extract
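
For context, here is a minimal sketch of why next_sibling lands on a NavigableString in the first place. It assumes the html variable holds the markup from the question, and it uses the built-in "html.parser" instead of "lxml" only so that it runs without an extra dependency. The names first_li, whitespace_node and second_li are illustrative, not from the original code.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString

# Assumption: the same markup as in the question, including its indentation.
html = """
<div class="account-places">
    <div>
        <ul class="location-history">
            <li></li>
            <li>Text to extract</li>
        </ul>
    </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
first_li = soup.find("div", {"class": "account-places"}).div.ul.li

# The newline and indentation between the two <li> tags form a text node,
# so the immediate sibling is a NavigableString, not a Tag.
whitespace_node = first_li.next_sibling
print(type(whitespace_node).__name__, repr(str(whitespace_node)))

# find_next_sibling("li") only considers tags named "li", so it skips
# the whitespace node and returns the second <li>.
second_li = first_li.find_next_sibling("li")
print(second_li.get_text())
```

If the markup had no whitespace between the two <li> tags, next_sibling would return the second <li> directly, which is why the original code can appear to work on minified HTML.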
Rakesh answered Sep 28 '22 01:09



Since the next_sibling call returns a NavigableString, which has no get_text() method, convert it to a string instead:

text = unicode(div.div.ul.li.next_sibling)  # Python 2; on Python 3 use str(...) instead

To quote the documentation:

A NavigableString is just like a Python Unicode string, except that it also supports some of the features described in Navigating the tree and Searching the tree. You can convert a NavigableString to a Unicode string with unicode()
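
A minimal Python 3 sketch of that conversion, with str() standing in for unicode(). The shortened markup and the names first_li, between and second_li are assumptions made for illustration, and "html.parser" is used only to avoid the lxml dependency. Note that in the question's markup the converted sibling is just the whitespace between the tags, so the wanted text is one sibling further along.

```python
from bs4 import BeautifulSoup

# Assumption: a shortened version of the question's markup, keeping the
# line break between the two <li> tags.
html = '<ul class="location-history"><li></li>\n<li>Text to extract</li></ul>'

soup = BeautifulSoup(html, "html.parser")
first_li = soup.ul.li

# str() turns the NavigableString into a plain Python 3 string.
between = str(first_li.next_sibling)
print(type(between).__name__, repr(between))

# That string is only the whitespace between the tags. Stepping over it
# reaches the second <li>, whose .string is the text the question wants.
second_li = first_li.next_sibling.next_sibling
print(str(second_li.string))
```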

Eliot K answered Sep 28 '22 02:09
