Consider the following code:
enum Test {
OptionOne,
OptionTwo
}
List<string> testValues = new List<string> { ... } // A huge collection of strings
foreach(var val in testValues)
{
if(val == Test.OptionOne.ToString()) // **Here**
{
// Do something
}
}
Will the compiler optimize the calls to Test.OptionOne.ToString()
or will it call it for each item in the testValues
collection?
Your question is one of loop-invariants analysis - can the compiler detect some expression in a loop that does not depend on the state of the loop for its evaluation, and has no side effects?
There's good reason to hope that the compiler could satisfy both propositions - the compiler could be smart enough to know that calling ToString()
on an enumeration doesn't change, ever; and that calling ToString()
on an enumeration doesn't have any appreciable side effects.
There might be reasons why the compiler would actively decide not to hoist the invariant - perhaps invoking the function is somehow faster than storing an extra variable on the stack.
The question boils down to whether or not it does.
I compiled the following program using VS2012 targeting .Net 4.6, and compiled with optimizations enabled. It appears that the compiler did not choose to hoist the invariant out of the loop:
public static void Main()
{
for( int i = 0; i < 10; i++ )
{
Console.Out.WriteLine( i );
Console.Out.WriteLine( Test.Option1.ToString() );
}
}
public enum Test
{
Option1,
Option2,
Option3
}
Here's the raw IL from the program that I obtained using ILSpy 2.3.1. Note the call to ToString()
, right in the middle of the loop.
.method public hidebysig static
void Main () cil managed
{
.custom instance void [mscorlib]System.STAThreadAttribute::.ctor() = (
01 00 00 00
)
// Method begins at RVA 0x2050
// Code size 46 (0x2e)
.maxstack 2
.entrypoint
.locals init (
[0] int32 i
)
IL_0000: ldc.i4.0
IL_0001: stloc.0
IL_0002: br.s IL_0028
// loop start (head: IL_0028)
IL_0004: call class [mscorlib]System.IO.TextWriter [mscorlib]System.Console::get_Out()
IL_0009: ldloc.0
IL_000a: callvirt instance void [mscorlib]System.IO.TextWriter::WriteLine(int32)
IL_000f: call class [mscorlib]System.IO.TextWriter [mscorlib]System.Console::get_Out()
IL_0014: ldc.i4.0
IL_0015: box TestProject.Program/Test
---> IL_001a: callvirt instance string [mscorlib]System.Object::ToString()
IL_001f: callvirt instance void [mscorlib]System.IO.TextWriter::WriteLine(string)
IL_0024: ldloc.0
IL_0025: ldc.i4.1
IL_0026: add
IL_0027: stloc.0
IL_0028: ldloc.0
IL_0029: ldc.i4.s 10
IL_002b: blt.s IL_0004
// end loop
IL_002d: ret
} // end of method Program::Main
I also got curious to see if the runtime JITer would hoist the invariant, but it too seems not to. I changed the code to the following, to make the assembly cleaner:
public static void Main()
{
TextWriter cons = Console.Out;
for( int i = 0; i < 10; i++ )
{
cons.WriteLine( i );
cons.WriteLine( Test.Option1.ToString() );
}
}
And then used VS's debugger to get the assembly, being careful to make sure VS allowed the JITer to optimize. It still did not hoist the call to ToString():
TextWriter cons = Console.Out;
00000000 push rdi
00000001 push rsi
00000002 sub rsp,28h
00000006 call 0000000050D76460
0000000b mov rsi,rax
for( int i = 0; i < 10; i++ )
0000000e xor edi,edi
{
cons.WriteLine( i );
00000010 mov rcx,rsi
00000013 mov edx,edi
00000015 mov rax,qword ptr [rsi]
00000018 mov rax,qword ptr [rax+60h]
0000001c call qword ptr [rax+28h]
cons.WriteLine( Test.Option1.ToString() );
0000001f mov rcx,7FE90116770h
00000029 call 000000005F6302D0
0000002e mov rcx,rsi
00000031 xor ecx,ecx
00000033 mov dword ptr [rax+8],ecx
00000036 mov rcx,rax
00000039 mov rax,qword ptr [rax]
0000003c mov rax,qword ptr [rax+40h]
00000040 call qword ptr [rax] <---- call System.Enum.ToString()
00000042 mov rdx,rax
00000045 mov rcx,rsi
00000048 mov rax,qword ptr [rsi]
0000004b mov rax,qword ptr [rax+68h]
0000004f call qword ptr [rax+20h]
for( int i = 0; i < 10; i++ )
00000052 inc edi
00000054 cmp edi,0Ah
00000057 jl 0000000000000010
00000059 add rsp,28h
}
}
No, but you can greatly reduce the complexity by doing something like this:
using System.Linq;
var testValues = new List<string> { ... }; // A huge collection of strings
var testDict = testValue.ToDictionary(elem => elem, elem => true);
var needle = Test.OptionOne.ToString();
if (testDict.ContainsKey(needle))
{
// do something
}
Dictionary value retrieval has a complexity of O(1).
I think an alternative is HashSet, as you have only keys. An interesting question and answers regarding building a HashSet
from a list can be found here.
[edit] Following Scott's comment, I will include option using Hashset (also in O(1)):
var testHash = new HashSet<string>(testValues);
if (testHash.Contains(needle))
{
// do something
}
Based on btlog
's correct observation, example code will fail for duplicates. It can be circumvented by either:
Distinct
However, the second approach raises the complexity to O(N)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With