I'm looking more for links to mailing list discussions, etc. rather than speculation.
Can anyone help me find out the rationale behind the quoted error handling rules from the CSS Selectors Level 3 spec.
User agents must observe the rules for handling parsing errors:
- a simple selector containing an undeclared namespace prefix is invalid
- a selector containing an invalid simple selector, an invalid combinator or an invalid token is invalid.
- a group of selectors containing an invalid selector is invalid.
Specifications reusing Selectors must define how to handle parsing errors. (In the case of CSS, the entire rule in which the selector is used is dropped.)
I had the following rule:
#menu li.last, #menu li:last-child { ... }
To compensate for IE8's lack of last-child support, I used a class and a JavaScript shim. However, this didn't work because IE8 complies with the CSS spec on error handling, and discards the entire rule because it doesn't recognise one selector. This can be fixed by separating the two selectors in to individual rules.
Why is this desirable? Why doesn't the spec suggest simply discarding the unrecognised selector, but keeping the rest of the rule?
I'd like to know the rationale, as the rules currently seem counter-intuitive.
The :invalid selector selects form elements with a value that does not validate according to the element's settings.
The :invalid selector allows you to select <input> elements that do not contain valid content, as determined by its type attribute. :invalid is defined in the CSS Selectors Level 3 spec as a “validity pseudo-selector”, meaning it is used to style interactive elements based on an evaluation of user input.
The invalid selector error is a WebDriver error that occurs when an element retrieval command is used with an unknown web element selector strategy. The available selector strategies are CSS, link text, partial link text, tag name, and XPath. Any other selector strategy is rejected with this error.
It means that you can use new CSS as an enhancement, knowing that no error will occur if it is not understood — the browser will either get the new feature or not. This enables basic fallback styling. This works particularly well when you want to use a value that is quite new and not supported everywhere.
Why is this desirable? Why doesn't the spec suggest simply discarding the unrecognised selector, but keeping the rest of the rule?
The short answer is because it'd be too difficult for implementations to figure out what exactly constitutes "the rest of the rule" (or "the rest of the selector list" for that matter) without getting it wrong and inadvertently messing up layouts, as well as for consistency in error handling, and forward compatibility with future specifications.
I'll preface my long answer with a link to another answer of mine, on handling of invalid selectors. A comment on that answer points directly to section 4.1.7 of the CSS2.1 spec on dealing with errors in selectors within rule sets, which mentions commas in selectors as an example. I think it sums it up pretty nicely:
CSS 2.1 gives a special meaning to the comma (,) in selectors. However, since it is not known if the comma may acquire other meanings in future updates of CSS, the whole statement should be ignored if there is an error anywhere in the selector, even though the rest of the selector may look reasonable in CSS 2.1.
While the comma itself still means grouping two or more selectors as far as selectors are concerned, it turns out that Selectors 4 introduces new functional pseudo-classes that accept selector groups (or selector lists) as arguments, such as :matches()
(it even changes :not()
so it accepts a list, making it similar to :matches()
, whereas in level 3 it only accepts a single simple selector).
This means that not only will you find comma-separated groups of selectors associated with rules, but you'll start finding them within functional pseudo-classes as well (note that this is within a stylesheet only; outside of CSS, selectors can appear in JavaScript code, used by selector libraries and the native Selectors API).
Although not the only reason by far, this alone is enough to potentially over-complicate a parser's error handling rules with a huge risk of breaking the selector, the rule set, or even the layout. In the event of a parsing error with a comma, the parser will have trouble determining whether this selector group corresponds to an entire rule set, or part of another selector group, and how to handle the rest of the selector and its associated rule set accordingly. Instead of trying to guess, risk guessing wrongly and breaking the rule in some way (e.g. by matching and styling all the wrong elements), the safest bet is to discard the rule and move on.
As an example, consider the following rule, whose selector is valid in level 4 but not in level 3, taken from this question of mine:
#sectors > div:not(.alpha, .beta, .gamma) { color: #808080; background-color: #e9e9e9; opacity: 0.5; }
A naïve parser that doesn't understand Selectors 4 may try to split this into three distinct selectors that share the same declaration block, instead of a single selector with a pseudo-class that accepts a list, based on the commas alone:
#sectors > div:not(.alpha .beta .gamma)
If it simply discards the first and last selectors which are obviously invalid, leaving the second selector which is valid, should it then try to apply the rule to any elements with class beta
? It's clearly not what the author intends to do, so if a browser does that, it's going to do something unexpected to this layout. By discarding the rule with the invalid selector, the layout looks just a little saner, but that's an over-simplified example; rules with layout-altering styles can cause even bigger problems if applied wrongly.
Of course, other ambiguities in selector parsing can occur too, which can lead to the following situations:
All of which, again, are most easily resolved by discarding the rule set instead of playing the guessing game.
In the case of seemingly well-formed selectors that are unrecognized, such as :last-child
as a pseudo-class in your example, the spec makes no distinction between unrecognized selectors and selectors that are just plain malformed. Both result in a parsing error. From the same section that you link to:
Invalidity is caused by a parsing error, e.g. an unrecognized token or a token which is not allowed at the current parsing point.
And by making that statement about :last-child
I'm assuming the browser is able to parse a single colon followed by an arbitrary ident as a pseudo-class in the first place; in reality you can't assume that an implementation will know to parse :last-child
as a pseudo-class correctly, or something like :lang()
or :not()
with a functional notation since functional pseudo-classes didn't appear until CSS2.
Selectors defines a specific set of known pseudo-classes and pseudo-elements, the names of which are most likely hardcoded in every implementation. The most naïve of parsers have the entire notation for each pseudo-class and pseudo-element, including the single/double colon(s), hardcoded (I wouldn't be surprised if the major browsers actually do this with :before
, :after
, :first-letter
and :first-line
as a special case). So what may seem like a pseudo-class to one implementation might very well be gobbledygook to another.
Since there are so many ways for implementations to fail, the spec makes no distinction, making error handling much more predictable. If a selector is unrecognized, no matter whether it's because it's unsupported or malformed, the rule is discarded. Simple, straightforward, and easy enough to get your head around.
All that said, there is at least one discussion in the www-style public mailing list suggesting that the specification be changed because it may not be so difficult to implement error handling by splitting selectors after all.
I should also mention that some layout engines behave differently, such as WebKit ignoring non-WebKit-prefixed selectors in a rule, applying its own prefixes, while other browsers ignore the rule completely (you can find more examples on Stack Overflow; here's a slightly different one). In a way you could say WebKit is skirting the rule as it is, although it does try to parse comma-separated selector groups smartly in spite of those prefixed selectors.
I don't think the working group has a compelling reason to change this behavior yet. In fact, if anything, they have a compelling reason not to change it, and that's because sites have been relying on this behavior for many years. In the past, we had selector hacks for filtering older versions of IE; today, we have prefixed selectors for filtering other browsers. These hacks all rely on the same behavior of certain browsers discarding rules they don't recognize, with other browsers applying them if they think they're correct, e.g. by recognizing prefixes (or throwing only the unrecognized ones out, as WebKit does). Sites could break in newer versions of those browsers if this rule were to change, which absolutely cannot happen in such a diversified (read: fragmented) Web as ours.
As of April 2013, it was decided in a telecon that this behavior remain unchanged for the reason I've postulated above:
- RESOLVED: Do not adopt MQ-style invalidation for Selectors due to Web-compat concerns.
Media query-style invalidation refers to invalid media queries in a comma-separated list not breaking the entire @media
rule.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With