python string replace digits

I am trying to replace certain parts of the string below.

'''<td align="center"> 5 </td> <td align="center"> 0.0001 </td>'''

I need to remove the <td> element if its value starts with '0.' (a decimal occurrence), i.e. the output should be

'''<td align="center"> 5 </td>'''

I have tried this

data = ' '.join(data.split())
l = data.replace('<td align="center"> 0.r"\d" </td>', "")

but it didn't succeed. Could anyone please help me with this?

Thanks in advance

funnyguy asked Dec 06 '22 16:12


1 Answer

While both of the regular expression examples work, I would advise against using regular expressions for this.

Especially if the data is a full HTML document, you should use an HTML-aware parser such as lxml.html, e.g.:

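from lxml import html

t = html.fromstring(text)
# Adjust the XPath to wherever the cells live in your document.
tds = t.xpath("table/tbody/tr[2]/td")
for td in tds:
    # td.text is None for empty cells; strip the padding around the value.
    if (td.text or "").strip().startswith("0."):
        td.getparent().remove(td)
text = html.tostring(t, encoding="unicode")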
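
To try the same idea on the short fragment from the question rather than a full document, a minimal self-contained sketch might look like the following. The `<table><tr>` wrapper and the helper name `drop_decimal_cells` are assumptions made for illustration, not part of the original data; the sketch also uses `drop_tree()` instead of `remove()` so that any text following a removed cell is kept.

```python
from lxml import html

# Sample cells from the question, wrapped in <table><tr> (an assumption)
# so that the <td> elements are parsed inside a table context.
fragment = (
    '<table><tr>'
    '<td align="center"> 5 </td> '
    '<td align="center"> 0.0001 </td>'
    '</tr></table>'
)


def drop_decimal_cells(markup):
    """Return the markup with every <td> whose text starts with '0.' removed."""
    root = html.fromstring(markup)
    # Collect the cells first so the tree is not mutated while iterating.
    for td in list(root.iter("td")):
        # td.text is None for empty cells; strip the padding around the value.
        if (td.text or "").strip().startswith("0."):
            # drop_tree() removes the element but preserves its tail text.
            td.drop_tree()
    return html.tostring(root, encoding="unicode")


cleaned = drop_decimal_cells(fragment)
print(cleaned)
```

Only the cell holding 5 should remain in the printed markup. Note that the check removes any cell whose value begins with "0.", which is what the question asks for; a stricter numeric test would need a different condition.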

Kimvais answered Dec 09 '22 16:12