I am using Python + BeautifulSoup to parse an HTML document.
Now I need to replace all <h2 class="someclass"> elements in an HTML document with <h1 class="someclass">.
How can I change the tag name, without changing anything else in the document?
Notes. To replace a tag in Beautiful Soup, find the element, then call its replace_with method, passing in either a string or a tag.
The NavigableString object represents the text content of a tag. To access it, use .string on the tag. You can replace the string with another string, but you can't edit the existing string in place.
You can just use the replace method on the string. >>> s = 'This is an [[example]] sentence. It is [[awesome]].
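The notes above on replace_with and .string can be sketched roughly as follows. This is a minimal illustration and not part of the original answer: it assumes the newer bs4 package (Beautiful Soup 4) on Python 3 with the built-in "html.parser" backend, and the sample markup is made up.

```python
from bs4 import BeautifulSoup

# Made-up sample markup, for illustration only.
soup = BeautifulSoup("<p>Hello <b>world</b></p>", "html.parser")

# Replace a whole tag with a newly built tag.
new_tag = soup.new_tag("i")
new_tag.string = "there"
soup.b.replace_with(new_tag)

# .string is a NavigableString: it cannot be edited in place,
# but it can be swapped for another string.
soup.i.string.replace_with("everyone")

print(soup)
```

Note that replace_with swaps out the whole element, which is more than the question needs when only the tag name has to change.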
I don't know how you're accessing tag, but the following works for me:
import BeautifulSoup

if __name__ == "__main__":
    data = """
<html>
<h2 class='someclass'>some title</h2>
<ul>
<li>Lorem ipsum dolor sit amet, consectetuer adipiscing elit.</li>
<li>Aliquam tincidunt mauris eu risus.</li>
<li>Vestibulum auctor dapibus neque.</li>
</ul>
</html>
"""
    soup = BeautifulSoup.BeautifulSoup(data)
    h2 = soup.find('h2')
    h2.name = 'h1'
    print soup
The output of the print soup command is:
<html>
<h1 class='someclass'>some title</h1>
<ul>
<li>Lorem ipsum dolor sit amet, consectetuer adipiscing elit.</li>
<li>Aliquam tincidunt mauris eu risus.</li>
<li>Vestibulum auctor dapibus neque.</li>
</ul>
</html>
As you can see, h2 became h1, and nothing else in the document changed. I am using Python 2.6 and BeautifulSoup 3.2.0.
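For readers on a newer setup, the same idea might look like the sketch below. It assumes Python 3 and the bs4 package with the built-in "html.parser" backend, neither of which is what the answer above was tested with, and it uses a shortened copy of the sample document.

```python
from bs4 import BeautifulSoup

# Shortened copy of the sample document from the answer above.
data = """<html>
<h2 class="someclass">some title</h2>
<ul>
<li>Lorem ipsum dolor sit amet.</li>
</ul>
</html>"""

soup = BeautifulSoup(data, "html.parser")

# Look up the first matching <h2>; find returns None when nothing matches.
h2 = soup.find("h2", class_="someclass")
if h2 is not None:
    h2.name = "h1"  # only the tag name changes; attributes and children stay

print(soup)
```

Assigning to .name renames both the opening and the closing tag when the document is printed, so no separate replace step is needed.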
If you have more than one h2 and you want to change them all, you could simply do:
soup = BeautifulSoup.BeautifulSoup(your_data)
while True:
    h2 = soup.find('h2')
    if not h2:
        break
    h2.name = 'h1'
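The loop above renames every h2, whatever its class. A possible variant that only touches <h2 class="someclass">, as the question asks, is sketched here. It assumes Python 3 and bs4 rather than the BeautifulSoup 3 module used above, and the three-heading sample is invented for illustration.

```python
from bs4 import BeautifulSoup

# Invented sample with one heading that should NOT be renamed.
html = """
<h2 class="someclass">first</h2>
<h2 class="other">second</h2>
<h2 class="someclass">third</h2>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all collects the matches up front, so renaming inside the loop
# does not disturb the iteration.
for tag in soup.find_all("h2", class_="someclass"):
    tag.name = "h1"

# List the tag names in document order to check the result.
print([t.name for t in soup.find_all(True)])
```

Filtering in find_all leaves headings with other classes as h2, which the find-until-empty loop would not do.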
It's just:
tag.name = 'new_name'