I have a soup in Python like this:
<p>
<span style="text-decoration: underline; color: #3366ff;">
Title:
</span>
Info
</p>
<p>
<span style="color: #3366ff;">
<span style="text-decoration: underline;">
Title2:
</span>
</span>
Info2
</p>
I'd like to get it to look like this:
<p>
Title:
Info
</p>
<p>
Title2:
Info2
</p>
Is there a way to do this with bs4?
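You'll want to use BeautifulSoup's unwrap() for this. It removes a tag but leaves the tag's contents in place.
import bs4
# html is assumed to hold the markup from the question
soup1 = bs4.BeautifulSoup(html, 'html.parser')
for match in soup1.find_all('span'):
    # drop the <span> itself, keep whatever was inside it
    match.unwrap()
print(soup1)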
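For a version that runs on its own, here is a minimal sketch of the unwrap() approach. The markup is a condensed copy of the HTML from the question, and the helper name strip_spans is only an illustrative choice, not part of bs4.

```python
from bs4 import BeautifulSoup

# Condensed copy of the question's markup (whitespace trimmed for readability).
html = (
    '<p><span style="text-decoration: underline; color: #3366ff;">Title:</span> Info</p>'
    '<p><span style="color: #3366ff;"><span style="text-decoration: underline;">'
    'Title2:</span></span> Info2</p>'
)

def strip_spans(markup):
    """Return markup with every <span> removed but its contents kept (hypothetical helper)."""
    soup = BeautifulSoup(markup, "html.parser")
    for span in soup.find_all("span"):
        # unwrap() replaces the tag with its children, so nested spans work too
        span.unwrap()
    return str(soup)

result = strip_spans(html)
print(result)
```

Nested spans are handled because each span, outer or inner, is replaced by its own children in turn.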
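You can also use replace_with to remove the span tags. Be aware that replacing a tag with an empty string discards the text inside it as well, so pass the tag's own text if you want to keep "Title:" and "Title2:":
from bs4 import BeautifulSoup
# html is assumed to hold the markup from the question
soup = BeautifulSoup(html, 'html.parser')
for span_tag in soup.find_all('span'):
    # swap the <span> for its plain text content
    span_tag.replace_with(span_tag.get_text())
print(soup)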
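As a quick check of how the two calls differ, the sketch below applies each one to the same small snippet. The variable names are made up for illustration; the point is only that unwrap() keeps the span's text while replace_with('') throws it away.

```python
from bs4 import BeautifulSoup

markup = '<p><span style="color: #3366ff;">Title:</span> Info</p>'

# unwrap(): the <span> goes away, its text stays in the paragraph
kept = BeautifulSoup(markup, "html.parser")
kept.span.unwrap()
kept_html = str(kept)

# replace_with(''): the <span> and everything inside it are replaced by nothing
dropped = BeautifulSoup(markup, "html.parser")
dropped.span.replace_with("")
dropped_html = str(dropped)

print(kept_html)
print(dropped_html)
```

For the output asked for in the question, where the titles must survive, the first behaviour is the one that matches.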