I want to match anything up to a specific word (e.g., the closing comment in C */
), however, due to performance reasons I don't want to make use of non-greedy operators.
E.g., to match C comments: /\*.*?\*/
is too slow for my files. Is there any possibility to improve performance?
Sure, use unrolling-the-loop technique:
/\*[^*]*(?:\*(?!/)[^*]*)*\*/
See regex demo
The unrolling the loop technique is based on the hypothesis that in most case, you kown in a repeteated alternation, which case should be the most usual and which one is exceptional. We will called the first one, the normal case and the second one, the special case. The general syntax of the unrolling the loop technique could then be written as:
normal* ( special normal* )*
Which could means something like, match the normal case, if you find a special case, matched it than match the normal case again. You notice that part of this syntax could potentialy lead to a super-linear match. To avoid a neverending match to append, the following rules shoud be carefully applied:
- the start of the special case and the normal case must be mutually exclusive
- special must always match at least one character
- the special expression must be atomic: be careful to the fact that
( special normal* )*
could be reduced to(special)*
, which if special isspecial*
, this became similar to(a*)*
which is a undeterministic expression.
C# pattern declaration (using a verbatim string literal):
var pattern = @"/\*[^*]*(?:\*(?!/)[^*]*)*\*/";
The regex breakdown:
/\*
- literal /*
[^*]*
- 0 or more character other than *
(?:\*(?!/)[^*]*)*
- 0 or more sequences of...
\*(?!/)
- a literal *
not followed by /
[^*]*
- 0 or more character other than *
\*/
- literal */
Here is a graph showing how efficient the 3 potentially identical regexps are (tested at regexhero.net*):
* Tested against /* Comment
* Typical
* Comment
*/
Try this:
/\*(?:[^*]|\*(?!/))*\*/
I don't know whether it is faster than stribizhev's answer.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With