
Extracting element text and inserting a space between elements

I'm parsing HTML using BeautifulSoup in Python.

I don't know how to insert a space between elements when extracting their text.

This is the code:

import BeautifulSoup
soup=BeautifulSoup.BeautifulSoup('<html>this<b>is</b>example</html>')
print soup.text

The output is:

thisisexample

But I want a space inserted between the pieces, like this:

this is example

How do I insert a space?

lumiere asked Jun 24 '11 11:06


2 Answers

Use getText instead:

import BeautifulSoup
soup=BeautifulSoup.BeautifulSoup('<html>this<b>is</b>example</html>')

print soup.getText(separator=u' ')
# prints: this is example
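
For newer installs, the same call can be sketched against BeautifulSoup 4 on Python 3. This assumes the `bs4` package is installed; there the method is spelled `get_text`, and the `html.parser` backend is chosen here only so that no extra parser library is needed.

```python
from bs4 import BeautifulSoup

# Same markup as in the question; 'html.parser' is the stdlib-backed parser.
soup = BeautifulSoup('<html>this<b>is</b>example</html>', 'html.parser')

# separator is placed between every pair of adjacent text nodes.
text = soup.get_text(separator=' ')
print(text)
```

If the real markup contains newlines or indentation between tags, passing `strip=True` as well trims the whitespace around each piece before joining.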

mouad answered Sep 20 '22 07:09


If your version of BeautifulSoup does not have getText, you could join the text nodes yourself:

In [26]: ' '.join(soup.findAll(text=True))
Out[26]: u'this is example'
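
The idea behind that one-liner (collect every text node, then join with a space) can also be sketched without BeautifulSoup at all, using only the Python 3 standard library. This is a swapped-in illustration rather than what the answer uses, and `TextCollector` is a hypothetical helper name.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: gathers text nodes, roughly like findAll(text=True)."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Invoked for each run of text between tags; skip whitespace-only runs.
        if data.strip():
            self.chunks.append(data.strip())


collector = TextCollector()
collector.feed('<html>this<b>is</b>example</html>')
collector.close()  # flush any text still buffered by the parser

joined = ' '.join(collector.chunks)
print(joined)
```

Unlike BeautifulSoup, this sketch makes no attempt to repair badly broken markup; it only shows where the separator comes from.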

unutbu answered Sep 20 '22 07:09