 

Unnecessary asterisk in regex that finds CSS comment

I considered posting this as an update to my previous, similar question, but it became too long.

I was trying to understand a regex given on w3.org that matches CSS comments, and I have a question:

Why do they use

\/\*[^*]*\*+([^/*][^*]*\*+)*\/
----------------^

instead of just

\/\*[^*]*\*+([^/][^*]*\*+)*\/

?

Both seem to work the same way. Why is there an extra star there?
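A quick way to check the "both work the same" observation is to run the two patterns over the same text and compare what they find. The sketch below uses Python's `re` module, which is an assumption on my part (the question doesn't name an engine), and the sample string and the `comments` helper are hypothetical, made up for illustration.

```python
import re

# The two patterns from the question, copied verbatim.
W3C = re.compile(r"\/\*[^*]*\*+([^/*][^*]*\*+)*\/")
SHORT = re.compile(r"\/\*[^*]*\*+([^/][^*]*\*+)*\/")


def comments(pattern, text):
    """Return every full match the pattern finds, scanning left to right."""
    # finditer + group(0) is used because findall would return only the
    # contents of the capturing group, not the whole comment.
    return [m.group(0) for m in pattern.finditer(text)]


# Hypothetical sample with star runs, an empty-ish comment and line breaks.
SAMPLE = (
    "a { color: red; } /* one */ b {} /** two **/ c {} "
    "/* multi\n * line\n */ d {} /***/"
)

w3c_found = comments(W3C, SAMPLE)
short_found = comments(SHORT, SAMPLE)

print(len(w3c_found), len(short_found))
print(w3c_found == short_found)
```

On this sample both patterns find the same four comments. That only shows they agree when scanning this particular input, not that they are equivalent in every setting.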

  1. Let's look at this part:

    \*+([^/*][^*]*\*+)*
    -A- --B--     -C-
    

    The regex engine matches the A part greedily, consuming stars until it reaches a character that is not a star (a line break counts as one). So once A is done, the next character cannot be a star. Why, then, do they use [^/*] instead of [^/]?

  2. Also look at the repeating capturing group.

    ([any one char that's not / or *][zero or more chars that are not *][one or more stars])

    Each pass captures a run of characters ending with at least one star. So C will take all the stars, leaving B with no stars to match in the next round.

    So the B part won't get a chance to meet any stars at all. That is why I think there's no need to put a star there.
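The reasoning in points 1 and 2 assumes that A and C never give back a star once they have taken it. In a backtracking engine they can, if the rest of the pattern is forced to match further. Here is a minimal sketch of that, again assuming Python's `re` (not something the question specifies) and a made-up input string. It is only an observation about the two patterns, not a claim about why the W3C grammar was written the way it was.

```python
import re

W3C = re.compile(r"\/\*[^*]*\*+([^/*][^*]*\*+)*\/")
SHORT = re.compile(r"\/\*[^*]*\*+([^/][^*]*\*+)*\/")

# Hypothetical input: a comment closed by "**/", then more text and a stray "*/".
TEXT = "/* a **/ b */"

# Scanning: A greedily takes "**", "/" follows, and both patterns stop there.
w3c_search = W3C.search(TEXT).group(0)
short_search = SHORT.search(TEXT).group(0)

# Forcing a match of the whole string makes the engine backtrack.
# With [^/], A can give back one star, B picks it up, and the group
# then runs across the first "*/". With [^/*], B can never take a star.
w3c_full = W3C.fullmatch(TEXT)
short_full = SHORT.fullmatch(TEXT)

print(repr(w3c_search), repr(short_search))
print(w3c_full is None, short_full is not None)
```

When scanning, both patterns return `/* a **/`. When the whole string must match, only the `[^/]` version succeeds, because B gets to meet a star after all. So the extra `*` makes no difference for a plain left-to-right search, but it can matter when something outside the pattern forces a longer match.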

But that regex is on w3.org, so I guess my understanding may be wrong. Please explain what I'm missing.

asked Oct 08 '22 19:10 by Vigneshwaran


1 Answer

This has already been corrected in the CSS3 Syntax module:

\/\*[^*]*\*+([^/][^*]*\*+)*\/   /* ignore comments */

Notice that the extraneous asterisk is gone, making this expression identical to what you have.

So it would seem that it was simply a mistake on their part while writing the grammar for CSS2. I'm digging through the mailing list archives to see if there's any relevant discussion there.

answered Oct 10 '22 08:10 by BoltClock