I'm trying to add a new link as an unordered list item, but I can't work out how to put one tag inside another with Beautiful Soup.
from bs4 import BeautifulSoup

with open('index.html') as fp:
    soup = BeautifulSoup(fp, 'html.parser')
a = soup.select_one("id[class=pr]")
ntag1 = soup.new_tag("a", href="hm/test")
ntag1.string = 'TEST'
... (part with problem)
a.insert_after(ntag2)
ntag1 must end up inside an "<li>", so I tried:
ntag2 = ntag1.new_tag('li')
TypeError: 'NoneType' object is not callable
and with wrap():
ntag2 = ntag1.wrap('li')
ValueError: Cannot replace one element with another when theelement to be replaced is not part of a tree.
Original HTML
<id class="pr">
</id>
<li>
<a href="pr/protocol">
protocol
</a>
Desired HTML output
<id class="pr">
</id>
<li>
<a href="hm/test">
TEST
</a>
</li>
<li>
<a href="pr/protocol">
protocol
</a>
</li>
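You get the NoneType error because ntag2 = ntag1.new_tag('li') tries to call a method that the Tag object doesn't have. Attribute access on a Tag looks for a child tag with that name, so ntag1.new_tag evaluates to None, and calling None raises the TypeError. new_tag is a method of the BeautifulSoup object, not of Tag.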
The "Cannot replace one element with another when theelement to be replaced is not part of a tree" error occurs because the tag you created has no association with the tree: it has no parent, and it must have one if you want to wrap it.
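Both failures can be reproduced in isolation. The following is a minimal sketch, assuming a one-line stand-in for the question's markup and the html.parser backend; it only inspects the values that lead to the two errors:

```python
from bs4 import BeautifulSoup

# Stand-in markup (an assumption); the real file is index.html in the question.
soup = BeautifulSoup('<id class="pr"></id>', "html.parser")
ntag1 = soup.new_tag("a", href="hm/test")

# Attribute access on a Tag searches for a child tag with that name,
# so this looks for a <new_tag> child and finds nothing.
print(ntag1.new_tag)  # None

# The new tag has not been inserted anywhere yet, so it has no parent.
print(ntag1.parent)  # None

# wrap() replaces the tag in its parent, which is impossible without one.
try:
    ntag1.wrap(soup.new_tag("li"))
    wrap_failed = False
except ValueError:
    wrap_failed = True
print(wrap_failed)
```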
It would make more sense to create the parent li and just append the anchor child:
from bs4 import BeautifulSoup

html = """<div class="pr">
</div>
<li>
<a href="pr/protocol">
protocol
</a>
</li>"""
soup = BeautifulSoup(html, "lxml")
a = soup.select_one("div[class=pr]")
# Li parent
parent = soup.new_tag("li", class_="parent")
# Child anchor
child = soup.new_tag("a", href="hm/test", class_="child")
child.string = 'TEST'
# Append child to parent
parent.append(child)
# Insert parent
a.insert_after(parent)
print(soup.prettify())
This gives you the output you want, although the HTML is not valid because the li elements are not inside a ul.
If you have an actual ul that follows a certain element, e.g.
html = """<div class="pr">
</div>
<ul>
<li>
<a href="pr/protocol">
protocol
</a>
</li>
</ul>
"""
Set the CSS selector for a to div[class=pr] + ul and insert the parent as the first child:
a = soup.select_one("div[class=pr] + ul")
.....
a.insert(0, parent)
print(soup.prettify())
Which would give you:
<html>
<body>
<div class="pr">
</div>
<ul>
<li class_="parent">
<a class_="child" href="hm/test">
TEST
</a>
</li>
<li>
<a href="pr/protocol">
protocol
</a>
</li>
</ul>
</body>
</html>
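Note that the output above contains class_ attributes: new_tag treats keyword arguments as literal attribute names, so class_="parent" does not become class="parent". A minimal sketch, assuming you want real class attributes, passes them through the attrs dictionary instead. It uses html.parser and condensed markup as assumptions, and fills the elided steps with the same calls as earlier:

```python
from bs4 import BeautifulSoup

html = """<div class="pr"></div>
<ul><li><a href="pr/protocol">protocol</a></li></ul>"""
soup = BeautifulSoup(html, "html.parser")

# Pass "class" via attrs; a class_ keyword would be emitted as class_="...".
parent = soup.new_tag("li", attrs={"class": "parent"})
child = soup.new_tag("a", href="hm/test", attrs={"class": "child"})
child.string = "TEST"
parent.append(child)

# Select the ul that directly follows the div and insert as its first child.
ul = soup.select_one("div[class=pr] + ul")
ul.insert(0, parent)
print(ul.prettify())
```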
Or if you wanted to wrap one existing tag:
from bs4 import BeautifulSoup
html = """<div class="pr">
</div>
<a href="pr/protocol">
protocol
</a>
"""
soup = BeautifulSoup(html, "lxml")
a = soup.select_one("div[class=pr] + a")
# wrap() works here because the anchor is already part of the tree
a.wrap(soup.new_tag("div"))
print(soup.prettify())
Which would wrap the existing anchor:
<html>
<body>
<div class="pr">
</div>
<div>
<a href="pr/protocol">
protocol
</a>
</div>
</body>
</html>
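Coming back to the original question, wrap() can also be used for the new link, provided the anchor is inserted into the tree first. A minimal sketch, assuming the question's markup (condensed here into one string) and the html.parser backend:

```python
from bs4 import BeautifulSoup

html = '<id class="pr"></id><li><a href="pr/protocol">protocol</a></li>'
soup = BeautifulSoup(html, "html.parser")
a = soup.select_one("id[class=pr]")

ntag1 = soup.new_tag("a", href="hm/test")
ntag1.string = "TEST"

# Insert first so the anchor has a parent, then wrap it in a new <li>.
a.insert_after(ntag1)
li = ntag1.wrap(soup.new_tag("li"))  # wrap() returns the wrapper tag

print(soup.prettify())
```

Building the li first and appending the anchor, as shown earlier, avoids the intermediate state where a bare anchor sits in the tree, so it is usually the simpler choice.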