I'm parsing HTML using BeautifulSoup in Python.
I don't know how to insert a space between elements when extracting the text.
This is the code:
import BeautifulSoup
soup=BeautifulSoup.BeautifulSoup('<html>this<b>is</b>example</html>')
print soup.text
Then the output is:
thisisexample
But I want a space inserted between the pieces, like this:
this is example
How do I insert a space?
Use getText instead; its separator argument is placed between adjacent text nodes:
import BeautifulSoup
soup=BeautifulSoup.BeautifulSoup('<html>this<b>is</b>example</html>')
print soup.getText(separator=u' ')
# this is example
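For readers on a newer install, here is a minimal sketch of the same idea. It assumes the bs4 package (BeautifulSoup 4) on Python 3 rather than the BeautifulSoup 3 module used above; there the method is spelled get_text, and the 'html.parser' argument picks the standard-library parser.

```python
from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4 (the bs4 package) is installed

# Same markup as in the question.
soup = BeautifulSoup('<html>this<b>is</b>example</html>', 'html.parser')

# separator is inserted between adjacent text nodes.
text = soup.get_text(separator=' ')
print(text)  # this is example
```

If the real markup contains stray whitespace around the text, get_text also accepts strip=True to trim each piece before joining.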
If your version of BeautifulSoup does not have getText,
then you could join the text nodes yourself:
In [26]: ' '.join(soup.findAll(text=True))
Out[26]: u'this is example'
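What that join does can be sketched without BeautifulSoup at all. The following is only an illustration using Python 3's standard-library html.parser, not BeautifulSoup's own implementation; the names TextCollector and text_with_separator are hypothetical helpers made up for this example.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: records every text node the parser encounters."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called once for each run of text between tags.
        self.chunks.append(data)


def text_with_separator(html, separator=' '):
    """Join all text nodes of html with separator (hypothetical helper)."""
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    return separator.join(collector.chunks)


print(text_with_separator('<html>this<b>is</b>example</html>'))  # this is example
```

The point is the same as in the one-liner above: the text nodes are collected separately, so any separator can be placed between them when they are joined.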