
Replacing the inner HTML with BeautifulSoup?

For a tag:

<div class="special_tag"></div>

I want a tag.replace_inner_html('inner') that results in:

<div class="special_tag">inner</div>

The closest thing I know of is replace_with('inner'), which replaces the whole tag (the outer HTML) rather than its contents.

Jesvin Jose asked Jan 27 '15 14:01



1 Answer

If you want to replace the inner text, set the tag's string attribute:

>>> from bs4 import BeautifulSoup
>>>
>>> soup = BeautifulSoup('''
... <div>
...     <div class="special_tag"></div>
... </div>
... ''', 'html.parser')
>>> elem = soup.find(class_='special_tag')
>>> elem.string = 'inner'
>>> print(elem)
<div class="special_tag">inner</div>
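
Two details of the string attribute are worth checking before relying on it. The sketch below uses made-up markup and assumes the built-in 'html.parser'. It suggests that assigning to string discards any existing children, and that the value is treated as text, so markup in it is escaped rather than parsed:

```python
from bs4 import BeautifulSoup

# Illustrative markup (not from the question): the div already has children.
soup = BeautifulSoup('<div class="special_tag"><p>old</p> text</div>', 'html.parser')
elem = soup.find(class_='special_tag')

# Assigning to .string drops the existing children and leaves a single text node.
elem.string = 'inner'
replaced = str(elem)
print(replaced)  # <div class="special_tag">inner</div>

# The value is plain text, so angle brackets are escaped on output.
elem.string = '<b>inner</b>'
escaped = str(elem)
print(escaped)  # <div class="special_tag">&lt;b&gt;inner&lt;/b&gt;</div>
```

So string is a fit for replacing inner text, but not for injecting HTML.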

If you want to add a tag (or several tags), clear the existing contents first, then insert or append the new nodes (use new_tag to create tags):

>>> elem = soup.find(class_='special_tag')
>>> elem.clear()
>>> elem.append('inner')
>>> print(elem)
<div class="special_tag">inner</div>
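
The transcript above appends a plain string. For an actual tag, the same clear-then-append pattern might look like the sketch below. The helper replace_inner_html is hypothetical (BeautifulSoup does not provide a method by that name). It parses an HTML fragment with the built-in 'html.parser' and moves the resulting nodes into the target tag:

```python
from bs4 import BeautifulSoup


def replace_inner_html(tag, html):
    """Hypothetical helper: replace everything inside `tag` with parsed `html`."""
    tag.clear()
    fragment = BeautifulSoup(html, 'html.parser')
    # Iterate over a copy: appending a node moves it out of `fragment`.
    for node in list(fragment.contents):
        tag.append(node)
    return tag


soup = BeautifulSoup('<div class="special_tag">old</div>', 'html.parser')
elem = soup.find(class_='special_tag')

# Option 1: build the child with new_tag and make it the only content.
elem.clear()
bold = soup.new_tag('b')
bold.string = 'inner'
elem.append(bold)
result_new_tag = str(elem)
print(result_new_tag)  # <div class="special_tag"><b>inner</b></div>

# Option 2: parse an HTML fragment and swap it in.
replace_inner_html(elem, '<i>inner</i> text')
result_fragment = str(elem)
print(result_fragment)  # <div class="special_tag"><i>inner</i> text</div>
```

new_tag suits a small, known structure. Parsing a fragment is closer to the replace_inner_html call the question asks for, at the cost of running the parser on the new markup.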

falsetru answered Oct 02 '22 19:10