I ran into this article: Performance: Compiled vs. Interpreted Regular Expressions. I modified the sample code to compile 1000 regexes and then run each one 500 times to take advantage of precompilation; however, even in that case the interpreted regexes run 4 times faster!
This means the RegexOptions.Compiled option is completely useless; actually, even worse, it's slower!
It turned out the big difference was due to JIT. After solving the JIT issue, the compiled regex in the following code still performs a little slower, which doesn't make sense to me, but @Jim in the answers provided a much cleaner version which works as expected.
Can anyone explain why this is the case?
Code, taken & modified from the blog post:
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Text.RegularExpressions;

    namespace RegExTester
    {
        class Program
        {
            static void Main(string[] args)
            {
                DateTime startTime = DateTime.Now;

                for (int i = 0; i < 1000; i++)
                {
                    CheckForMatches("some random text with email address, [email protected]" + i.ToString());
                }

                double msTaken = DateTime.Now.Subtract(startTime).TotalMilliseconds;
                Console.WriteLine("Full Run: " + msTaken);

                startTime = DateTime.Now;

                for (int i = 0; i < 1000; i++)
                {
                    CheckForMatches("some random text with email address, [email protected]" + i.ToString());
                }

                msTaken = DateTime.Now.Subtract(startTime).TotalMilliseconds;
                Console.WriteLine("Full Run: " + msTaken);

                Console.ReadLine();
            }

            private static List<Regex> _expressions;
            private static object _SyncRoot = new object();

            private static List<Regex> GetExpressions()
            {
                if (_expressions != null)
                    return _expressions;

                lock (_SyncRoot)
                {
                    if (_expressions == null)
                    {
                        DateTime startTime = DateTime.Now;

                        List<Regex> tempExpressions = new List<Regex>();
                        string regExPattern = @"^[a-zA-Z0-9]+[a-zA-Z0-9._%-]*@{0}$";

                        for (int i = 0; i < 2000; i++)
                        {
                            tempExpressions.Add(new Regex(
                                string.Format(regExPattern,
                                    Regex.Escape("domain" + i.ToString() + "." + (i % 3 == 0 ? ".com" : ".net"))),
                                RegexOptions.IgnoreCase));// | RegexOptions.Compiled
                        }

                        _expressions = new List<Regex>(tempExpressions);

                        DateTime endTime = DateTime.Now;
                        double msTaken = endTime.Subtract(startTime).TotalMilliseconds;
                        Console.WriteLine("Init:" + msTaken);
                    }
                }

                return _expressions;
            }

            static List<Regex> expressions = GetExpressions();

            private static void CheckForMatches(string text)
            {
                DateTime startTime = DateTime.Now;

                foreach (Regex e in expressions)
                {
                    bool isMatch = e.IsMatch(text);
                }

                DateTime endTime = DateTime.Now;
                //double msTaken = endTime.Subtract(startTime).TotalMilliseconds;
                //Console.WriteLine("Run: " + msTaken);
            }
        }
    }
The reason the regex is so slow is that the "*" quantifier is greedy by default, so the first ".*" tries to match the whole string and then backtracks character by character. The runtime is exponential in the count of numbers on a line.
Being more specific with your regular expressions, even if they become much longer, can make a world of difference in performance. The fewer characters you scan to determine the match, the faster your regexes will be.
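As a rough illustration of "being more specific", the sketch below contrasts a broad pattern with a narrower one. Both patterns, the class name SpecificPatternDemo, and the "timeout=30" input are made up for this example; they are not the regex discussed above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SpecificPatternDemo
{
    // Broad (hypothetical): the first ".*" consumes the whole line,
    // then backtracks until an '=' can be matched.
    private static readonly Regex Broad = new Regex(@"^(.*)=(.*)$");

    // Specific (hypothetical): "[^=]+" stops at the first '=' and
    // "\d+" only accepts digits, so there is far less to retry.
    private static readonly Regex Specific = new Regex(@"^([^=]+)=(\d+)$");

    // Returns the first capture group, or an empty string if there is no match.
    public static string ExtractKey(Regex regex, string line)
    {
        Match m = regex.Match(line);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        const string line = "timeout=30";
        Console.WriteLine(ExtractKey(Broad, line));
        Console.WriteLine(ExtractKey(Specific, line));
    }
}
```

On this short input both patterns extract the same key; the difference is in how much of the line the engine has to scan and re-scan, which only becomes noticeable on longer lines or patterns with several broad quantifiers.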
Plain string operations are generally faster than the equivalent regular expression operations.
Compiled regular expressions match faster when used as intended. As others have pointed out, the idea is to compile them once and use them many times. The construction and initialization time are amortized out over those many runs.
I created a much simpler test that will show you that compiled regular expressions are unquestionably faster than not compiled.
    using System;
    using System.Diagnostics;
    using System.Text.RegularExpressions;

    class Program
    {
        const int NumIterations = 1000;
        const string TestString = "some random text with email address, [email protected]";
        const string Pattern = "^[a-zA-Z0-9]+[a-zA-Z0-9._%-]*@domain0\\.\\.com$";

        // Both regexes are constructed once, before any timing starts.
        private static Regex NormalRegex = new Regex(Pattern, RegexOptions.IgnoreCase);
        private static Regex CompiledRegex = new Regex(Pattern, RegexOptions.IgnoreCase | RegexOptions.Compiled);
        private static Regex DummyRegex = new Regex("^.$");

        static void Main(string[] args)
        {
            // Times `count` calls to IsMatch on the given regex.
            var DoTest = new Action<string, Regex, int>((s, r, count) =>
            {
                Console.Write("Testing {0} ... ", s);
                Stopwatch sw = Stopwatch.StartNew();
                for (int i = 0; i < count; ++i)
                {
                    bool isMatch = r.IsMatch(TestString + i.ToString());
                }
                sw.Stop();
                Console.WriteLine("{0:N0} ms", sw.ElapsedMilliseconds);
            });

            // Make sure that DoTest is JITed
            DoTest("Dummy", DummyRegex, 1);
            DoTest("Normal first time", NormalRegex, 1);
            DoTest("Normal Regex", NormalRegex, NumIterations);
            DoTest("Compiled first time", CompiledRegex, 1);
            DoTest("Compiled", CompiledRegex, NumIterations);

            Console.WriteLine();
            Console.Write("Done. Press Enter:");
            Console.ReadLine();
        }
    }
Setting NumIterations to 500 gives me this:

    Testing Dummy ... 0 ms
    Testing Normal first time ... 0 ms
    Testing Normal Regex ... 1 ms
    Testing Compiled first time ... 13 ms
    Testing Compiled ... 1 ms
With 5 million iterations, I get:

    Testing Dummy ... 0 ms
    Testing Normal first time ... 0 ms
    Testing Normal Regex ... 17,232 ms
    Testing Compiled first time ... 17 ms
    Testing Compiled ... 15,299 ms

Here you see that the compiled regular expression is at least 10% faster than the not compiled version.
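To put the amortization in numbers, here is a back-of-the-envelope sketch using the figures from the 5-million-iteration run above. The BreakEven class is hypothetical, and treating the 17 ms "Compiled first time" figure as the whole one-time cost is an assumption: construction happens in a static initializer that the test does not time, so the real figure is probably somewhat higher.

```csharp
using System;

public static class BreakEven
{
    // Number of calls needed before the one-time cost of the compiled regex
    // is repaid by the time saved on each call.
    public static double Iterations(double oneTimeMs, double interpretedMs, double compiledMs, long runs)
    {
        double savedPerCallMs = (interpretedMs - compiledMs) / runs;
        return oneTimeMs / savedPerCallMs;
    }

    public static void Main()
    {
        // 17 ms first-call cost; 17,232 ms vs. 15,299 ms over 5,000,000 calls.
        double n = Iterations(17, 17232, 15299, 5000000);
        Console.WriteLine("Break-even after roughly {0:N0} iterations", n);
    }
}
```

With these inputs the break-even point lands at roughly 44,000 calls, which is consistent with the 500-iteration run showing no benefit at all.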
It's interesting to note that if you remove RegexOptions.IgnoreCase from your regular expression, the results from 5 million iterations are even more striking:

    Testing Dummy ... 0 ms
    Testing Normal first time ... 0 ms
    Testing Normal Regex ... 12,869 ms
    Testing Compiled first time ... 14 ms
    Testing Compiled ... 8,332 ms

Here, the compiled regular expression is 35% faster than the not compiled regular expression.
In my opinion, the blog post you reference is simply a flawed test.
http://www.codinghorror.com/blog/2005/03/to-compile-or-not-to-compile.html
Compiled helps only if you instantiate it once and re-use it multiple times. If you're creating a compiled regex in the for loop then it obviously will perform worse. Can you show us your sample code?
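A minimal sketch of the difference, assuming a made-up pattern and made-up input addresses (LoopPlacementDemo and both method names are hypothetical): the first method constructs the compiled regex inside the loop, the second constructs it once and reuses it.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class LoopPlacementDemo
{
    // Hypothetical pattern and options, used by both methods.
    private const string Pattern = @"^[a-z0-9]+@example\.com$";
    private const RegexOptions Options = RegexOptions.IgnoreCase | RegexOptions.Compiled;

    // What to avoid: a new compiled Regex is constructed on every iteration.
    public static int CountMatchesRebuilding(string[] inputs)
    {
        int count = 0;
        foreach (string input in inputs)
        {
            var regex = new Regex(Pattern, Options);
            if (regex.IsMatch(input)) count++;
        }
        return count;
    }

    // Intended use: construct once, then reuse for every input.
    public static int CountMatchesReusing(string[] inputs)
    {
        var regex = new Regex(Pattern, Options);
        int count = 0;
        foreach (string input in inputs)
        {
            if (regex.IsMatch(input)) count++;
        }
        return count;
    }

    public static void Main()
    {
        // Half of the inputs end in example.com, half in example.net.
        string[] inputs = new string[200];
        for (int i = 0; i < inputs.Length; i++)
        {
            inputs[i] = "user" + i + (i % 2 == 0 ? "@example.com" : "@example.net");
        }

        Stopwatch sw = Stopwatch.StartNew();
        int rebuilt = CountMatchesRebuilding(inputs);
        sw.Stop();
        Console.WriteLine("Rebuilding: {0} matches, {1} ms", rebuilt, sw.ElapsedMilliseconds);

        sw.Restart();
        int reused = CountMatchesReusing(inputs);
        sw.Stop();
        Console.WriteLine("Reusing:    {0} matches, {1} ms", reused, sw.ElapsedMilliseconds);
    }
}
```

Both methods return the same count; only the timings differ, and the exact numbers will depend on the runtime and machine.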