In C# should you have code like:
    public static string importantRegex = "magic!";

    public void F1()
    {
        // code
        if (Regex.IsMatch(importantRegex))
        {
            // codez in here.
        }
        // more code
    }

    public void main()
    {
        F1();
        /* some stuff happens...... */
        F1();
    }
or should you persist an instance of a Regex containing the important pattern? What is the cost of using Regex.IsMatch? I imagine an NFA is created in each Regex instance, and from what I understand, creating that NFA is non-trivial.
In a rare departure from my typical egotism, I'm kind of reversing myself on this answer.
My original answer, preserved below, was based on an examination of version 1.1 of the .NET framework. This is pretty shameful, since .NET 2.0 had been out for over three years at the time of my answer, and it contained changes to the Regex class that significantly affect the difference between the static and instance methods.
In .NET 2.0 (and 4.0), the static IsMatch function is defined as follows:

    public static bool IsMatch(string input, string pattern)
    {
        return new Regex(pattern, RegexOptions.None, true).IsMatch(input);
    }
The significant difference here is that little true as the third argument. That corresponds to a parameter named "useCache". When it is true, the parsed tree is retrieved from the cache on the second and subsequent uses of the same pattern.
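As a rough illustration of that caching, here is a minimal sketch. The class and method names (RegexCacheDemo, MatchTwice, AllowMorePatterns) are made up for this example; the only framework members used are the static Regex.IsMatch and the public Regex.CacheSize property, which limits how many patterns the static methods keep cached (the documented default is 15, as far as I know).

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the framework.
public static class RegexCacheDemo
{
    public static bool MatchTwice(string input, string pattern)
    {
        // The first static call parses the pattern and stores the result in the cache.
        bool first = Regex.IsMatch(input, pattern);

        // A second call with the same pattern and options can reuse the cached parse
        // instead of parsing the pattern string again.
        bool second = Regex.IsMatch(input, pattern);

        return first && second;
    }

    public static void AllowMorePatterns(int size)
    {
        // Regex.CacheSize caps the number of cached patterns. Raising it may help
        // if the code rotates through many distinct patterns via the static methods.
        Regex.CacheSize = size;
    }
}
```

Calling RegexCacheDemo.MatchTwice("abc123", "^[a-z]+[0-9]+$") pays the parsing cost only on the first IsMatch call, assuming the pattern has not been evicted from the cache in between.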
This caching eats up most, but not all, of the performance difference between the static and instance methods. In my tests, the static IsMatch method was still about 20% slower than the instance method, but that only amounted to about a half second increase when run 100 times over a set of 10,000 input strings (for a total of 1 million operations).
This 20% slowdown can still be significant in some scenarios. If you find yourself regexing hundreds of millions of strings, you'll probably want to take every step you can to make it more efficient. But I'd bet that 99% of the time, you're using a particular Regex no more than a handful of times, and the extra millisecond you lose to the static method won't be even close to noticeable.
Props to devgeezer, who pointed this out almost a year ago, although no one seemed to notice.
My old answer follows:
The static IsMatch function is defined as follows:

    public static bool IsMatch(string input, string pattern)
    {
        return new Regex(pattern).IsMatch(input);
    }
And, yes, initialization of a Regex object is not trivial. You should use the static IsMatch (or any of the other static Regex functions) as a quick shortcut only for patterns that you will use only once. If you will reuse the pattern, it's worth it to reuse a Regex object, too.
As to whether or not you should specify RegexOptions.Compiled, as suggested by Jon Skeet, that's another story. The answer there is: it depends. For simple patterns or for patterns used only a handful of times, it may well be faster to use a non-compiled instance. You should definitely profile before deciding. The cost of compiling a regular expression object is quite large indeed, and may not be worth it.
Take, as an example, the following:
    using System;
    using System.Diagnostics;
    using System.Text.RegularExpressions;

    const int count = 10000;
    string pattern = "^[a-z]+[0-9]+$";
    string input = "abc123";

    // 1. Static method: the pattern is looked up (or parsed) on every call.
    Stopwatch sw = Stopwatch.StartNew();
    for (int i = 0; i < count; i++)
        Regex.IsMatch(input, pattern);
    Console.WriteLine("static took {0} seconds.", sw.Elapsed.TotalSeconds);

    // 2. Reused, non-compiled instance (construction is included in the timing).
    sw.Reset();
    sw.Start();
    Regex rx = new Regex(pattern);
    for (int i = 0; i < count; i++)
        rx.IsMatch(input);
    Console.WriteLine("instance took {0} seconds.", sw.Elapsed.TotalSeconds);

    // 3. Reused, compiled instance (compilation is included in the timing).
    sw.Reset();
    sw.Start();
    rx = new Regex(pattern, RegexOptions.Compiled);
    for (int i = 0; i < count; i++)
        rx.IsMatch(input);
    Console.WriteLine("compiled took {0} seconds.", sw.Elapsed.TotalSeconds);
At count = 10000, as listed, the second output is fastest. Increase count to 100000, and the compiled version wins.
If you're going to reuse the regular expression multiple times, I'd create it with RegexOptions.Compiled and cache it. There's no point in making the framework parse the regex pattern every time you want it.
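Applied to the code in the question, caching the instance might look something like the sketch below. The class and method names (ImportantMatcher, ContainsMagic) are invented for illustration, and whether RegexOptions.Compiled actually pays off depends on how often the pattern runs, so treat that flag as something to measure rather than a given.

```csharp
using System.Text.RegularExpressions;

// Hypothetical wrapper around the question's "magic!" pattern.
public static class ImportantMatcher
{
    // Constructed once, when the type is first used, and shared by every call.
    // RegexOptions.Compiled is optional here; drop it if profiling shows the
    // one-time compilation cost outweighs the faster matching.
    private static readonly Regex ImportantRegex =
        new Regex("magic!", RegexOptions.Compiled);

    public static bool ContainsMagic(string input)
    {
        // Reuses the cached instance, so the pattern is never parsed again.
        return ImportantRegex.IsMatch(input);
    }
}
```

Each call site then becomes ImportantMatcher.ContainsMagic(someInput) in place of the static Regex.IsMatch call.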