Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Will the C# compiler optimize calls to a same method inside a loop?

Tags:

performance

c#

Consider the following code:

enum Test {
    OptionOne,
    OptionTwo
}

List<string> testValues = new List<string> { ... } // A huge collection of strings

foreach(var val in testValues)
{
    if(val == Test.OptionOne.ToString()) // **Here**
    {
        // Do something
    }
}

Will the compiler optimize the calls to Test.OptionOne.ToString() or will it call it for each item in the testValues collection?

like image 288
Rat Salad Avatar asked Feb 02 '16 17:02

Rat Salad


2 Answers

Your question is one of loop-invariants analysis - can the compiler detect some expression in a loop that does not depend on the state of the loop for its evaluation, and has no side effects?

There's good reason to hope that the compiler could satisfy both propositions - the compiler could be smart enough to know that calling ToString() on an enumeration doesn't change, ever; and that calling ToString() on an enumeration doesn't have any appreciable side effects.

There might be reasons why the compiler would actively decide not to hoist the invariant - perhaps invoking the function is somehow faster than storing an extra variable on the stack.

The question boils down to whether or not it does.

I compiled the following program using VS2012 targeting .Net 4.6, and compiled with optimizations enabled. It appears that the compiler did not choose to hoist the invariant out of the loop:

    public static void Main()
    {
        for( int i = 0; i < 10; i++ )
        {
            Console.Out.WriteLine( i );
            Console.Out.WriteLine( Test.Option1.ToString() );
        }
    }

    public enum Test
    {
        Option1,
        Option2,
        Option3
    }

Here's the raw IL from the program that I obtained using ILSpy 2.3.1. Note the call to ToString(), right in the middle of the loop.

.method public hidebysig static 
    void Main () cil managed 
{
    .custom instance void [mscorlib]System.STAThreadAttribute::.ctor() = (
        01 00 00 00
    )
    // Method begins at RVA 0x2050
    // Code size 46 (0x2e)
    .maxstack 2
    .entrypoint
    .locals init (
        [0] int32 i
    )

    IL_0000: ldc.i4.0
    IL_0001: stloc.0
    IL_0002: br.s IL_0028
    // loop start (head: IL_0028)
        IL_0004: call class [mscorlib]System.IO.TextWriter [mscorlib]System.Console::get_Out()
        IL_0009: ldloc.0
        IL_000a: callvirt instance void [mscorlib]System.IO.TextWriter::WriteLine(int32)
        IL_000f: call class [mscorlib]System.IO.TextWriter [mscorlib]System.Console::get_Out()
        IL_0014: ldc.i4.0
        IL_0015: box TestProject.Program/Test
--->    IL_001a: callvirt instance string [mscorlib]System.Object::ToString()
        IL_001f: callvirt instance void [mscorlib]System.IO.TextWriter::WriteLine(string)
        IL_0024: ldloc.0
        IL_0025: ldc.i4.1
        IL_0026: add
        IL_0027: stloc.0

        IL_0028: ldloc.0
        IL_0029: ldc.i4.s 10
        IL_002b: blt.s IL_0004
    // end loop

    IL_002d: ret
} // end of method Program::Main

I also got curious to see if the runtime JITer would hoist the invariant, but it too seems not to. I changed the code to the following, to make the assembly cleaner:

    public static void Main()
    {
        TextWriter cons = Console.Out;
        for( int i = 0; i < 10; i++ )
        {
            cons.WriteLine( i );
            cons.WriteLine( Test.Option1.ToString() );
        }
    }

And then used VS's debugger to get the assembly, being careful to make sure VS allowed the JITer to optimize. It still did not hoist the call to ToString():

            TextWriter cons = Console.Out;
00000000  push        rdi 
00000001  push        rsi 
00000002  sub         rsp,28h 
00000006  call        0000000050D76460 
0000000b  mov         rsi,rax 
            for( int i = 0; i < 10; i++ )
0000000e  xor         edi,edi 
            {
                cons.WriteLine( i );
00000010  mov         rcx,rsi 
00000013  mov         edx,edi 
00000015  mov         rax,qword ptr [rsi] 
00000018  mov         rax,qword ptr [rax+60h] 
0000001c  call        qword ptr [rax+28h] 
                cons.WriteLine( Test.Option1.ToString() );
0000001f  mov         rcx,7FE90116770h 
00000029  call        000000005F6302D0 
0000002e  mov         rcx,rsi 
00000031  xor         ecx,ecx 
00000033  mov         dword ptr [rax+8],ecx 
00000036  mov         rcx,rax 
00000039  mov         rax,qword ptr [rax] 
0000003c  mov         rax,qword ptr [rax+40h] 
00000040  call        qword ptr [rax]            <---- call System.Enum.ToString()
00000042  mov         rdx,rax 
00000045  mov         rcx,rsi 
00000048  mov         rax,qword ptr [rsi] 
0000004b  mov         rax,qword ptr [rax+68h] 
0000004f  call        qword ptr [rax+20h] 
            for( int i = 0; i < 10; i++ )
00000052  inc         edi 
00000054  cmp         edi,0Ah 
00000057  jl          0000000000000010 
00000059  add         rsp,28h 
            }
        }
like image 96
antiduh Avatar answered Nov 01 '22 03:11

antiduh


No, but you can greatly reduce the complexity by doing something like this:

using System.Linq;

var testValues = new List<string> { ... }; // A huge collection of strings
var testDict = testValue.ToDictionary(elem => elem, elem => true);

var needle = Test.OptionOne.ToString();
if (testDict.ContainsKey(needle))
{
   // do something
}

Dictionary value retrieval has a complexity of O(1).

I think an alternative is HashSet, as you have only keys. An interesting question and answers regarding building a HashSet from a list can be found here.

[edit] Following Scott's comment, I will include option using Hashset (also in O(1)):

var testHash = new HashSet<string>(testValues);
if (testHash.Contains(needle))
{
   // do something
}

Based on btlog's correct observation, example code will fail for duplicates. It can be circumvented by either:

  • build directly the HashSet when fetching data (avoid the list in the first place)
  • obtain a second list having distinct values by using Distinct

However, the second approach raises the complexity to O(N)

like image 26
Alexei - check Codidact Avatar answered Nov 01 '22 04:11

Alexei - check Codidact