
Are end-of-string regexes optimized in .NET?

Tags: .net, regex

Aside: OK, I know I shouldn't be picking apart HTML like this with a regex, but it's the simplest approach for what I need.

I have this regex:

Regex BodyEndTagRegex = new Regex("</body>(.*)$", RegexOptions.Compiled |
    RegexOptions.IgnoreCase | RegexOptions.Multiline);

Notice how I'm looking for the end of the string with $.

Are .NET's regular expressions optimized so that the engine doesn't have to scan the entire string? If not, how can I optimize the search to start at the end?

Daniel A. White asked Sep 23 '11 12:09



1 Answer

The regex engine does not optimize this automatically, but you can control the search direction yourself by specifying the right-to-left mode option.

I believe the key point is:

By default, the regular expression engine searches from left to right.

You can reverse the search direction by using the RegexOptions.RightToLeft option. The search automatically begins at the last character position of the string. For pattern-matching methods that include a starting position parameter, such as Regex.Match(String, Int32), the starting position is the index of the rightmost character position at which the search is to begin.
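As a rough sketch of the option described above, the pattern from the question could be run with RegexOptions.RightToLeft added. The sample strings are made up for illustration, and whether this is actually faster than the default left-to-right search would need to be measured on real input.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input, made up for this example.
string html = "<html><body><p>Hello</p></body></html>";

// The regex from the question with RegexOptions.RightToLeft added,
// so the engine starts searching at the end of the string.
var bodyEndTagRegex = new Regex("</body>(.*)$",
    RegexOptions.Compiled | RegexOptions.IgnoreCase |
    RegexOptions.Multiline | RegexOptions.RightToLeft);

Match tail = bodyEndTagRegex.Match(html);
Console.WriteLine(tail.Success);          // True
Console.WriteLine(tail.Groups[1].Value);  // </html>

// With a starting position, a right-to-left search begins at that index
// and moves toward the start of the string.
string text = "abc cba abc";
var abc = new Regex("abc", RegexOptions.RightToLeft);
Match fromEnd = abc.Match(text);      // search begins at the end of the string
Match fromEight = abc.Match(text, 8); // search begins at index 8 and moves left
Console.WriteLine(fromEnd.Index);     // 8
Console.WriteLine(fromEight.Index);   // 0
```

Capture groups are reported in normal reading order even though the search runs backwards, so the text after the closing body tag can be read from Groups[1] as usual.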

Important:

The RegexOptions.RightToLeft option changes the search direction only; it does not interpret the regular expression pattern from right to left.
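A small example may make that distinction concrete. In this sketch (the input strings are invented for illustration), the pattern is still written and read left to right; only the position where the engine starts looking changes.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input, made up for this example.
string input = "width: 10px; height: 20px";

// The search starts at the end of the string, so the last occurrence is found first.
Match last = Regex.Match(input, @"\d+px", RegexOptions.RightToLeft);
Console.WriteLine(last.Value); // 20px

// The pattern itself is not reversed: mirrored text such as "xp02" does not match.
bool reversed = Regex.IsMatch("xp02", @"\d+px", RegexOptions.RightToLeft);
Console.WriteLine(reversed); // False
```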

sll answered Sep 20 '22 14:09