I am using the following code to compare the performance of a regex built with Regex.CompileToAssembly against one created with RegexOptions.Compiled, but the results are not what I expected. Please let me know what I am missing. Thanks!
static readonly Regex regex = new Regex(@"(stats|pause\s?(all|\d+(\,\d+)*)|start\s?(all|\d+(\,\d+)*)|add\s?time\s?(all|\d+(\,\d+)*)(\s\d+)|c(?:hange)?\s?p(?:asskey)?|close)(.*)", RegexOptions.Compiled);
static readonly Regex reg = new Regex(@"(stats|pause\s?(all|\d+(\,\d+)*)|start\s?(all|\d+(\,\d+)*)|add\s?time\s?(all|\d+(\,\d+)*)(\s\d+)|c(?:hange)?\s?p(?:asskey)?|close)(.*)");
static readonly Regex level4 = new DuplicatedString();
static void Main()
{
    const string str = "add time 243,3453,43543,543,534534,54534543,345345,4354354235,345435,34543534 6873brekgnfkjerkgiengklewrij";
    const int itr = 1000000;
    CompileToAssembly();
    Match match;
    Stopwatch sw = new Stopwatch();
    sw.Start();
    for (int i = 0; i < itr; i++)
    {
        match = regex.Match(str);
    }
    sw.Stop();
    Console.WriteLine("RegexOptions.Compiled: {0}ms", sw.ElapsedMilliseconds);
    sw.Reset();
    sw.Start();
    for (int i = 0; i < itr; i++)
    {
        match = level4.Match(str);
    }
    sw.Stop();
    Console.WriteLine("CompiledToAssembly: {0}ms", sw.ElapsedMilliseconds);
    sw.Reset();
    sw.Start();
    for (int i = 0; i < itr; i++)
    {
        match = reg.Match(str);
    }
    sw.Stop();
    Console.WriteLine("Interpreted: {0}ms", sw.ElapsedMilliseconds);
    Console.ReadLine();
}
public static void CompileToAssembly()
{
    RegexCompilationInfo expr;
    List<RegexCompilationInfo> compilationList = new List<RegexCompilationInfo>();
    // Define regular expression to detect duplicate words
    expr = new RegexCompilationInfo(@"(stats|pause\s?(all|\d+(\,\d+)*)|start\s?(all|\d+(\,\d+)*)|add\s?time\s?(all|\d+(\,\d+)*)(\s\d+)|c(?:hange)?\s?p(?:asskey)?|close)(.*)",
        RegexOptions.Compiled,
        "DuplicatedString",
        "Utilities.RegularExpressions",
        true);
    // Add info object to list of objects
    compilationList.Add(expr);
    // Apply AssemblyTitle attribute to the new assembly
    //
    // Define the parameter(s) of the AssemblyTitle attribute's constructor
    Type[] parameters = { typeof(string) };
    // Define the assembly's title
    object[] paramValues = { "General-purpose library of compiled regular expressions" };
    // Get the ConstructorInfo object representing the attribute's constructor
    ConstructorInfo ctor = typeof(System.Reflection.AssemblyTitleAttribute).GetConstructor(parameters);
    // Create the CustomAttributeBuilder object array
    CustomAttributeBuilder[] attBuilder = { new CustomAttributeBuilder(ctor, paramValues) };
    // Generate assembly with compiled regular expressions
    RegexCompilationInfo[] compilationArray = new RegexCompilationInfo[compilationList.Count];
    AssemblyName assemName = new AssemblyName("RegexLib, Version=1.0.0.1001, Culture=neutral, PublicKeyToken=null");
    compilationList.CopyTo(compilationArray);
    Regex.CompileToAssembly(compilationArray, assemName, attBuilder);
}
The results are as follows:
RegexOptions.Compiled: 3908ms
CompiledToAssembly: 59349ms
Interpreted: 5653ms
Your code has a problem: static field initializers will run before static methods run. That means that level4 has already been assigned before Main() runs. This means that the object referred to by level4 is not an instance of the class created in CompileToAssembly().
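One way to stop depending on initializer order is to defer construction until after the setup step has run. The following is only a minimal sketch, assuming Lazy<T> is acceptable for this; the class name DeferredInitDemo and the \d+ pattern are placeholders standing in for DuplicatedString, not the code from the question. It addresses only when the object is created, not which assembly the type is loaded from.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DeferredInitDemo
{
    // Lazy<T> runs the factory only when .Value is first read,
    // not when the static field itself is initialized.
    public static readonly Lazy<Regex> Deferred = new Lazy<Regex>(() =>
    {
        Console.WriteLine("creating regex");
        return new Regex(@"\d+"); // placeholder pattern (assumption)
    });

    static void Main()
    {
        // Stand-in for the CompileToAssembly() call in the question.
        Console.WriteLine("setup step runs first");

        // The factory above executes here, after the setup step.
        Regex regex = Deferred.Value;
        Console.WriteLine(regex.IsMatch("abc 123"));
    }
}
```

Run as a console program, this prints "setup step runs first" before "creating regex", which is the ordering the original benchmark needs.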
Note that the example code for Regex.CompileToAssembly shows the compilation of the regex and its consumption in two different programs. The actual regex you're timing as "CompiledToAssembly" could therefore be a different regex that you compiled in an earlier test.
Another factor to consider: the overhead of loading an assembly into memory and jitting it to machine code might be significant enough that you need more than 1,000,000 iterations to see a benefit.
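To keep one-time costs out of the measurement, a common approach is to make one untimed call before starting the stopwatch. The sketch below assumes a hypothetical helper named TimeMatches and a placeholder pattern; it is not taken from the question and makes no claim about what numbers you will see.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class WarmUpBenchmark
{
    // Hypothetical helper: times `iterations` calls to Match after one untimed call.
    public static long TimeMatches(Regex regex, string input, int iterations)
    {
        // Warm-up: any lazy construction, assembly loading and JIT work
        // triggered by the first call happens outside the timed region.
        regex.Match(input);

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            regex.Match(input);
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    static void Main()
    {
        const string input = "add time 243,3453 6873"; // shortened sample input (assumption)
        Regex compiled = new Regex(@"\d+(,\d+)*", RegexOptions.Compiled);
        Regex interpreted = new Regex(@"\d+(,\d+)*");

        Console.WriteLine("Compiled: {0}ms", TimeMatches(compiled, input, 100000));
        Console.WriteLine("Interpreted: {0}ms", TimeMatches(interpreted, input, 100000));
    }
}
```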
You are running under a debugger (Visual Studio), which prevents JIT optimizations from being applied when an assembly is loaded. Try running without the debugger (Ctrl+F5).
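A benchmark can report the conditions it is running under so that unrepresentative timings are easy to spot. This is a small sketch using Debugger.IsAttached and the DEBUG compilation symbol; the RunConditions class and its Describe method are made up for illustration.

```csharp
using System;
using System.Diagnostics;

public static class RunConditions
{
    // Hypothetical helper: summarizes whether timings are likely to be trustworthy.
    public static string Describe()
    {
        bool debugBuild = false;
#if DEBUG
        debugBuild = true; // set only when the DEBUG symbol is defined at compile time
#endif
        return string.Format("debugger attached: {0}, DEBUG build: {1}",
            Debugger.IsAttached, debugBuild);
    }

    static void Main()
    {
        Console.WriteLine(Describe());
        if (Debugger.IsAttached)
        {
            Console.WriteLine("Warning: a debugger is attached; timings may not be representative.");
        }
    }
}
```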