Decode html entities using BeautifulSoup

I am trying to decode entities using BeautifulSoup but with no luck.

from BeautifulSoup import BeautifulSoup

decoded = BeautifulSoup("&lt;p&gt; &lt;/p&gt;", convertEntities=BeautifulSoup.HTML_ENTITIES)

print decoded

The output is not decoded at all. I found a lot of answers here that use this method. Am I doing something wrong?

I would like to use BeautifulSoup for this so please don't bother telling me that the standard library has a method to decode entities.

kechap asked Nov 04 '22 02:11

1 Answer

You need to print decoded.contents:

>>> print decoded
&lt;p&gt; &lt;/p&gt;
>>> print decoded.contents
[u'<p> </p>']
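
For readers on Python 3, where the BeautifulSoup 3 module used above is not available, here is a minimal sketch of the same distinction with the bs4 package (BeautifulSoup 4). It assumes bs4 is installed and uses the built-in "html.parser" backend; the wrapping <div> and the variable names are only for illustration and are not part of the original question. As far as I know, bs4 decodes entities while parsing without needing a convertEntities argument, and re-escapes special characters when the soup is turned back into a string.

```python
# Assumption: bs4 (BeautifulSoup 4) on Python 3, not the BeautifulSoup 3 API from the question.
from bs4 import BeautifulSoup

# The <div> wrapper is illustrative; the text inside it is made of escaped entities.
soup = BeautifulSoup("<div>&lt;p&gt; &lt;/p&gt;</div>", "html.parser")

# Serializing the soup escapes <, > and & again, so the entities reappear.
serialized = str(soup)

# The text node stored in the tree already holds the decoded characters.
decoded_text = soup.get_text()

print(serialized)
print(decoded_text)
```

In other words, the entities are decoded in the parsed tree; they only look undecoded because printing the soup serializes it back to markup.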
Gabi Purcaru answered Nov 15 '22 13:11