I am fetching my result from an RSS feed using the following code:
try:
    desc = item.xpath('description')[0].text
    if date is not None:
        desc = date + "\n\n" + desc
except Exception:
    # No description element, or its text could not be combined with the date
    desc = None
But sometimes the description in the RSS feed contains HTML tags, as below:
This is sample text
<img src="http://imageURL" alt="" />
When displaying the content, I do not want any HTML tags to appear on the page. Is there a regular expression to remove the HTML tags?
Try:
import re
pattern = re.compile(r'</?\w+\s*[^>]*?/?>', re.DOTALL | re.MULTILINE | re.IGNORECASE | re.UNICODE)
text = pattern.sub(" ", text)
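As a rough sketch of how this pattern might be applied to the sample description from the question: the helper name `strip_tags` and the `None` check are assumptions added for illustration, not part of the original answer.

```python
import re

# Same pattern as in the answer: matches opening, closing, and self-closing tags.
TAG_PATTERN = re.compile(r'</?\w+\s*[^>]*?/?>', re.DOTALL | re.IGNORECASE | re.UNICODE)

def strip_tags(text):
    """Replace each HTML tag with a space and trim the ends (hypothetical helper)."""
    if text is None:
        # A missing description stays None, as in the question's code
        return None
    return TAG_PATTERN.sub(" ", text).strip()

sample = 'This is sample text\n<img src="http://imageURL" alt="" />'
print(strip_tags(sample))  # prints: This is sample text
```

Because the pattern requires a tag name right after `<`, it will not match comments such as `<!-- ... -->`, and it does not decode entities like `&amp;`; those would need separate handling if the feed contains them.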