I want a regex that matches conditional comments in an HTML source page so I can remove only those. I want to preserve the regular comments.
I would also like to avoid using the .*? notation if possible.
The text is
foo
<!--[if IE]>
<style type="text/css">
ul.menu ul li{
font-size: 10px;
font-weight:normal;
padding-top:0px;
}
</style>
<![endif]-->
bar
and I want to remove everything from <!--[if IE]> to <![endif]-->.
EDIT: I want to remove these tags because of BeautifulSoup, which fails to parse the page and gives an incomplete source.
EDIT2: [if IE] isn't the only condition. There are lots more and I don't have any list of all possible combinations.
EDIT3: Vinko Vrsalovic's solution works, but the actual reason BeautifulSoup failed was a rogue comment within the conditional comment, like this:
<!--[if lt IE 7.]>
<script defer type="text/javascript" src="pngfix_253168.js"></script><!--png fix for IE-->
<![endif]-->
Notice the <!--png fix for IE--> comment?
Though my problem was solved, I would love to get a regex solution for this.
>>> from BeautifulSoup import BeautifulSoup, Comment
>>> html = '<html><!--[if IE]> bloo blee<![endif]--></html>'
>>> soup = BeautifulSoup(html)
>>> comments = soup.findAll(text=lambda text: isinstance(text, Comment) and text.find('if') != -1)
>>> [comment.extract() for comment in comments]
[u'[if IE]> bloo blee<![endif]']
>>> print soup.prettify()
<html>
</html>
>>>
Python 3 with bs4:
from bs4 import BeautifulSoup, Comment
html = '<html><!--[if IE]> bloo blee<![endif]--></html>'
soup = BeautifulSoup(html, "html.parser")
comments = soup.findAll(text=lambda text: isinstance(text, Comment) and text.find('if') != -1)
[comment.extract() for comment in comments]
# the extracted comments: ['[if IE]> bloo blee<![endif]']
print (soup.prettify())
If your data confuses BeautifulSoup, you can fix it beforehand or customize the parser, among other solutions.
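As one way of fixing the markup beforehand, here is a minimal regex sketch that strips conditional comments before the page reaches the parser. It is only an illustration: the names CONDITIONAL_COMMENT and strip_conditional_comments are made up for this example, and it assumes the usual downlevel-hidden form, <!--[if ...]> ... <![endif]-->. It avoids .*? by using a negative lookahead, so the match runs up to the first <![endif]--> and is not cut short by a stray --> inside the block.

```python
import re

# Hypothetical helper names; the pattern assumes the downlevel-hidden form
# <!--[if ...]> ... <![endif]--> and nothing else.
CONDITIONAL_COMMENT = re.compile(
    r"<!--\[if\b[^\]]*\]>"        # opener, e.g. <!--[if lt IE 7.]>
    r"(?:(?!<!\[endif\]-->).)*"   # any character that does not start the closer
    r"<!\[endif\]-->",            # closer
    re.DOTALL | re.IGNORECASE,
)


def strip_conditional_comments(html):
    """Return html with every conditional comment block removed."""
    return CONDITIONAL_COMMENT.sub("", html)


page = (
    "foo\n"
    "<!--[if lt IE 7.]>\n"
    '<script defer type="text/javascript" src="pngfix_253168.js"></script>'
    "<!--png fix for IE-->\n"
    "<![endif]-->\n"
    "bar<!-- a regular comment -->"
)
cleaned = strip_conditional_comments(page)
```

Regular comments outside a conditional block are left alone, while a comment nested inside one (like the png fix comment) goes away with the block. Downlevel-revealed blocks written as <![if ...]> are not covered by this pattern.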
EDIT: Per your comment, just modify the lambda passed to findAll as you need (I modified it).
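For instance, since [if IE] is not the only condition, the test could look at the shape of the comment body rather than search for the substring 'if'. The following is a sketch under that assumption; looks_conditional is a hypothetical helper, not part of BeautifulSoup. It checks only the opener, because a nested --> can end the comment early and the body may then no longer finish with <![endif].

```python
import re

# Hypothetical helper: decides from the comment body alone whether it opens
# a conditional block such as "[if IE]>" or "[if lt IE 7.]>".
OPENER = re.compile(r"\s*\[if\b[^\]]*\]>", re.IGNORECASE)


def looks_conditional(comment_body):
    """True if the text starts like a conditional comment opener."""
    return OPENER.match(comment_body) is not None


samples = [
    "[if IE]> bloo blee<![endif]",
    "[if lt IE 7.]>\n<script defer></script>",
    "png fix for IE",
    " if you see this, keep it ",
]
flags = [looks_conditional(s) for s in samples]
```

It could then be combined with the type check in the lambda, e.g. lambda text: isinstance(text, Comment) and looks_conditional(text).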