How can I wrap <div data-role="content"></div> around the contents of the HTML body with Beautiful Soup?
I tried to start with the following but haven't been able to make any progress:
from bs4 import BeautifulSoup
soup = BeautifulSoup(u"%s" % response)
wrapper = soup.new_tag('div', **{"data-role":"content"})
soup.body.append(wrapper)
for content in soup.body.contents:
    wrapper.append(content)
This appends the wrapper to the body, but doesn't wrap the body's contents as I need. I also tried iterating over body.children, with no luck.
-- edit --
I've gotten this far, but now I end up with duplicate body elements, like this: <body><div data-role="content"><body>content here</body></div></body>
from bs4 import BeautifulSoup
soup = BeautifulSoup(u"%s" % response)
wrapper = soup.new_tag('div', **{"data-role":"content"})
new_body = soup.new_tag('body')
contents = soup.body.replace_with(new_body)
wrapper.append(contents)
new_body.append(wrapper)
How about this?
from bs4 import BeautifulSoup
soup = BeautifulSoup(unicode(response))
wrapper = soup.new_tag('div', **{"data-role":"content"})
body_children = list(soup.body.children)
soup.body.clear()
soup.body.append(wrapper)
for child in body_children:
    wrapper.append(child)
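The same idea can be packaged as a small helper. This is only a sketch: the function name wrap_body_contents and the sample markup are made up for illustration, and it assumes the built-in "html.parser" backend rather than whatever parser the original response was parsed with.

```python
from bs4 import BeautifulSoup


def wrap_body_contents(soup, wrapper):
    """Move every child of <body> into wrapper, then place wrapper in <body>.

    Hypothetical helper, not part of Beautiful Soup itself.
    """
    # Snapshot the children first: appending a node somewhere else removes
    # it from body.contents, which would disturb iteration over a live list.
    children = list(soup.body.contents)
    soup.body.clear()
    soup.body.append(wrapper)
    for child in children:
        wrapper.append(child)
    return wrapper


# Sample markup, assumed for the example only.
doc = BeautifulSoup("<body><p>a</p>text<p>b</p></body>", "html.parser")
wrap_body_contents(doc, doc.new_tag("div", **{"data-role": "content"}))
print(doc)
```

Taking the snapshot with list(...) before moving anything is the part that matters; the rest is the same sequence of calls as above.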
I recently hit upon this same situation, and I'm not content with any of the other answers here. Iterating through a massive list and rebuilding the DOM doesn't seem acceptable to me performance-wise, and the other solution wraps the body, not the body's contents. Here's my solution:
soup.body.wrap(soup.new_tag("div", **{"data-role": "content"})).wrap(soup.new_tag("body"))
soup.body.body.unwrap()
This approach simply wraps the body twice: first in the new tag, then in another body. BeautifulSoup's unwrap method then removes the original body while keeping its contents in place.
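For anyone who wants to try it, here is a self-contained run of the double wrap followed by unwrap. The sample HTML and the choice of the "html.parser" backend are assumptions for the sake of the example; other parsers may add or normalize surrounding tags differently.

```python
from bs4 import BeautifulSoup

# Sample document, assumed for the example only.
html = "<html><body><h1>Title</h1><p>Text</p></body></html>"
soup = BeautifulSoup(html, "html.parser")

# Step 1: wrap the original body in the div, then wrap that div in a new body.
# wrap() returns the wrapper it was given, so the calls can be chained.
wrapper = soup.new_tag("div", **{"data-role": "content"})
soup.body.wrap(wrapper).wrap(soup.new_tag("body"))

# Step 2: soup.body is now the new outer body and soup.body.body is the
# original one. unwrap() replaces the original body with its own children.
soup.body.body.unwrap()

print(soup)
```

After the unwrap there is a single body again, with the div as its only child and the original contents inside the div.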