How do I use BeautifulSoup to replace a tag with its contents?

How would I use BeautifulSoup to remove only a tag? The method I found deletes the tag and all other tags and content inside it. I want to remove only the tag and leave everything inside it untouched, e.g.

change this:

<div>
<p>dvgbkfbnfd</p>
<div>
<span>dsvdfvd</span>
</div>
<p>fvjdfnvjundf</p>
</div>

to this:

<p>dvgbkfbnfd</p>
<span>dsvdfvd</span>
<p>fvjdfnvjundf</p>
Blainer asked May 11 '12 17:05



1 Answer

I've voted to close this as a duplicate, but in case it's of use, reapplying slacy's answer from the top related question gives you this solution:

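# Note: this targets BeautifulSoup 3 and Python 2.
from BeautifulSoup import BeautifulSoup

html = '''
<div>
<p>dvgbkfbnfd</p>
<div>
<span>dsvdfvd</span>
</div>
<p>fvjdfnvjundf</p>
</div>
'''

soup = BeautifulSoup(html)
# Replace every <div> with its own children, so the contents stay in place.
for match in soup.findAll('div'):
    match.replaceWithChildren()

print soup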

... which produces the output:

<p>dvgbkfbnfd</p>

<span>dsvdfvd</span>

<p>fvjdfnvjundf</p>
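
For anyone on Python 3, here is a minimal sketch of the same approach using Beautiful Soup 4. It assumes the bs4 package is installed; the choice of the built-in "html.parser" is my assumption, not part of the original answer. In bs4 the same operation is called unwrap(), and findAll is spelled find_all.

```python
from bs4 import BeautifulSoup

html = """
<div>
<p>dvgbkfbnfd</p>
<div>
<span>dsvdfvd</span>
</div>
<p>fvjdfnvjundf</p>
</div>
"""

# "html.parser" is Python's built-in parser (an assumption made here so that
# no extra parser library is needed).
soup = BeautifulSoup(html, "html.parser")

# unwrap() removes the tag itself but leaves its children in the tree.
for div in soup.find_all("div"):
    div.unwrap()

print(soup)
```

To strip only some of the divs, narrow the search (for example by class or id) before calling unwrap() on each match.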
Mark Longair answered Nov 14 '22 20:11
