
Using lxml.html's cssselect to select element with colon in ID attribute

I have an element in a page that looks like this:

<a id="cid-694094:Comment:188384" name="694094:Comment:188384"></a>

If you do document.cssselect("#cid-694094:Comment:188384") you will get:

lxml.cssselect.ExpressionError: The psuedo-class Symbol(u'Comment', 12) is unknown

The solution for that is handled in this question (the person was using Java).

However, when I try the same escaping in Python:

document.cssselect(r"#cid-694094\:Comment\:188384")

I get:

lxml.cssselect.SelectorSyntaxError: Bad symbol 'cid-694094\': 'unicodeescape' codec can't decode byte 0x5c in position 10: \ at end of string at [Token(u'#', 0)] -> None

The reason for that, along with a proposed solution, can be found in this question. If I understand it correctly, I should be doing:

document.cssselect(r"#cid-694094\\:Comment\\:188384")

But this still doesn't work. Instead I once again get:

lxml.cssselect.ExpressionError: The psuedo-class Symbol(u'Comment\', 14) is unknown

Can anybody tell me what I'm doing wrong?

Try it yourself using:

import lxml.html
document = lxml.html.fromstring(
    '<a id="cid-694094:Comment:188384" name="694094:Comment:188384"></a>'
)
document.cssselect(r"#cid-694094\:Comment\:188384")
Bruce van der Kooij asked Jun 09 '26 05:06



1 Answer

Isn't a colon (:) disallowed in CSS id and class selectors?

Here is a work-around:

document.xpath('//a[@id="cid-694094:Comment:188384"]')
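If the id comes from a variable rather than a literal, the same workaround can be wrapped in a small helper. This is only a sketch: find_by_id is a hypothetical name, not part of lxml, and it relies on lxml's support for XPath variables passed as keyword arguments, so the value never has to be spliced into the expression or escaped.

```python
import lxml.html


def find_by_id(document, element_id):
    """Return every element whose id attribute equals element_id exactly.

    find_by_id is a hypothetical helper for illustration, not an lxml API.
    """
    # $id is an XPath variable; lxml fills it in from the keyword argument,
    # so colons (or quotes) in the value need no escaping.
    return document.xpath("//*[@id=$id]", id=element_id)


document = lxml.html.fromstring(
    '<a id="cid-694094:Comment:188384" name="694094:Comment:188384"></a>'
)
matches = find_by_id(document, "cid-694094:Comment:188384")
print([element.get("name") for element in matches])
```

If you would rather stay with cssselect, an attribute selector such as a[id="cid-694094:Comment:188384"] should also avoid the problem, since the colons then sit inside a quoted string instead of an identifier.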
Ski answered Jun 10 '26 18:06



