
How to only get inner text of a tag in BeautifulSoup, excluding the embedded one?

For example,

<ul>
    <li>
        <b>Hey, sexy!</b>
        Hello
    </li>
</ul>

I want only 'Hello' from the li tag.

If I use soup.find("ul").li.text, it includes the text of the b tag as well.

Pranav asked Sep 15 '25 02:09



1 Answer

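You could use the find function with recursive=False, so that only the direct children of the li tag are searched and the text inside the b tag is skipped: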

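from bs4 import BeautifulSoup

html = '''<ul><li><b>Hey, sexy!</b>Hello</li></ul>'''
# Name the parser explicitly so the result does not depend on which parsers are installed.
soup = BeautifulSoup(html, 'html.parser')
# text=True matches text nodes (text is the older name of the string argument, Beautiful Soup 4.4.0+).
print(soup.find('li').find(text=True, recursive=False))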
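
Note that the one-liner above returns the first direct text node. With the indented markup from the question, that first node is just the whitespace before the b tag, so you may get an empty-looking string instead of 'Hello'. Here is a rough sketch of one way around that: collect every direct text node and drop the blank ones. The helper name own_text is made up for this example, and string=True assumes Beautiful Soup 4.4.0 or later.

```python
from bs4 import BeautifulSoup

# The markup from the question, indentation included.
html = """
<ul>
    <li>
        <b>Hey, sexy!</b>
        Hello
    </li>
</ul>
"""


def own_text(tag):
    """Join the text nodes that are direct children of tag.

    Hypothetical helper for this example, not part of Beautiful Soup.
    Text inside nested tags such as <b> is not included.
    """
    # recursive=False restricts the search to direct children;
    # string=True keeps only text nodes.
    parts = tag.find_all(string=True, recursive=False)
    # Strip each piece and skip the whitespace-only ones.
    return " ".join(p.strip() for p in parts if p.strip())


soup = BeautifulSoup(html, "html.parser")
li_text = own_text(soup.find("li"))
print(li_text)
```

If the li tag has text both before and after the b tag, this joins the pieces with a single space, which may or may not be what you want.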
Paul Rooney answered Sep 17 '25 20:09
