python lxml: syntax for selectively deleting inline style attributes?

I'm using Python 3.4 with the lxml.html library.

I'm trying to remove the border-bottom inline style from HTML elements that I've targeted with a CSS selector.

Here's a code fragment showing a sample td element and my selector:

html_snippet = lxml.html.fromstring("""<td valign="bottom" colspan="10" align="center" style="background-color:azure; border-bottom:1px solid #000000"><font style="font-family:Times New Roman" size="2">Estimated Future Payouts</font> \n            <br/><font style="font-family:Times New Roman" size="2">Under Non-Equity Incentive</font> \n            <br/><font style="font-family:Times New Roman" size="2">Plan Awards</font> \n        </td>""")
selection = html_snippet.cssselect('td[style*="border-bottom"]')[0]  # cssselect returns a list; take the first match
selection.attrib['style']
>>>>'background-color:azure; border-bottom:1px solid #000000'

What's the proper way to access the inline style properties so I can remove the border-bottom property from every element my selector matches?

deseosuho asked May 27 '26 09:05


2 Answers

One approach is to split the style attribute value on ;, build a map from CSS property name to value, remove border-bottom from the map, and rebuild the style attribute by joining the remaining entries with ;. Sample implementation:

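style = selection.attrib['style']
# Split on ";" and strip whitespace so "a:b; c:d" and "a:b;c:d" both parse;
# split(":", 1) keeps values that contain a colon (e.g. URLs) intact.
properties = dict(
    [part.strip() for part in item.split(":", 1)]
    for item in style.split(";") if ":" in item
)

properties.pop('border-bottom', None)  # no KeyError if the property is absent

selection.attrib['style'] = "; ".join(key + ":" + value for key, value in properties.items())

# encoding="unicode" returns str instead of bytes on Python 3
print(lxml.html.tostring(selection, encoding="unicode"))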

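I'm pretty sure this solution is easy to break, since it relies on plain string splitting rather than real CSS parsing (a semicolon inside a quoted value, for example, would confuse it).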
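
The split-and-rejoin idea above can also be sketched as a small standalone helper that works on the style string alone, so it can be tried without lxml. The function name remove_css_property is hypothetical (it is not part of lxml or any library), and this is still naive string handling, not a CSS parser. It keeps the declarations in a list rather than a dict so their original order is preserved.

```python
def remove_css_property(style, name):
    """Return the inline style string `style` without the property `name`.

    Hypothetical helper for illustration. Naive string handling: a ";"
    inside a quoted value would still be split incorrectly.
    """
    kept = []
    for declaration in style.split(";"):
        if not declaration.strip():
            continue  # skip empty pieces, e.g. after a trailing ";"
        # Only the text before the first ":" is the property name.
        prop = declaration.split(":", 1)[0].strip().lower()
        if prop != name.lower():
            kept.append(declaration.strip())
    return "; ".join(kept)


style = "background-color:azure; border-bottom:1px solid #000000"
print(remove_css_property(style, "border-bottom"))
# prints: background-color:azure
```

With lxml, the same helper could be applied to every match by looping over the list that cssselect returns and assigning el.attrib['style'] = remove_css_property(el.attrib['style'], 'border-bottom') for each element el.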


Alternatively, here is a rather "crazy" option: dump the data into an HTML file, open the file in a browser via Selenium, remove the property via JavaScript, and print the HTML representation of the element afterwards:

import os
from selenium import webdriver   

data = """
<td valign="bottom" colspan="10" align="center" style="background-color:azure; border-bottom:1px solid #000000"><font style="font-family:Times New Roman" size="2">Estimated Future Payouts</font> \n            <br/><font style="font-family:Times New Roman" size="2">Under Non-Equity Incentive</font> \n            <br/><font style="font-family:Times New Roman" size="2">Plan Awards</font> \n        </td>
"""
with open("index.html", "w") as f:
    f.write("<body><table><tr>%s</tr></table></body>" % data)

driver = webdriver.Chrome()
driver.get("file://" + os.path.abspath("index.html"))

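# find_element_by_tag_name was removed in Selenium 4; pass the locator
# strategy explicitly ("tag name" is the value of By.TAG_NAME)
td = driver.find_element("tag name", "td")
driver.execute_script("arguments[0].style['border-bottom'] = '';", td)

print(td.get_attribute("outerHTML"))

driver.quit()  # end the WebDriver session and close the browser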

Prints:

<td valign="bottom" colspan="10" align="center" style="background-color: rgb(240, 255, 255);"><font
        style="font-family:Times New Roman" size="2">Estimated Future Payouts</font>
    <br><font style="font-family:Times New Roman" size="2">Under Non-Equity Incentive</font>
    <br><font style="font-family:Times New Roman" size="2">Plan Awards</font>
</td>

alecxe answered May 31 '26 18:05


There is a package for that, cssutils, although it is overkill in this case.

import cssutils
sheet = cssutils.parseStyle('background-color: azure;border-bottom:1px solid #000000')
sheet.removeProperty('border-bottom')  # returns '1px solid #000'
print(sheet.cssText)

Outputs background-color: azure

headuck answered May 31 '26 17:05


