Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Overhead of Iterating T[] cast to IList<T>

I've noticed a performance hit of iterating over a primitive collection (T[]) that has been cast to a generic interface collection (IList or IEnumberable).

For example:

    private static int Sum(int[] array)
    {
        int sum = 0;

        foreach (int i in array)
            sum += i;

        return sum;
    }

The above code executes significantly faster than the code below, where the parameter is changed to type IList (or IEnumerable):

    private static int Sum(IList<int> array)
    {
        int sum = 0;

        foreach (int i in array)
            sum += i;

        return sum;
    }

The performance hit still occurs if the object passed is a primitive array, and if I try changing the loop to a for loop instead of a foreach loop.

I can get around the performance hit by coding it like such:

    private static int Sum(IList<int> array)
    {
        int sum = 0;

        if( array is int[] )
            foreach (int i in (int[])array)
                sum += i;
        else
            foreach (int i in array)
                sum += i;

        return sum;
    }

Is there a more elegant way of solving this issue? Thank you for your time.

Edit: My benchmark code:

    static void Main(string[] args)
    {
        int[] values = Enumerable.Range(0, 10000000).ToArray<int>();
        Stopwatch sw = new Stopwatch();

        sw.Start();
        Sum(values);
        //Sum((IList<int>)values);
        sw.Stop();

        Console.WriteLine("Elasped: {0} ms", sw.ElapsedMilliseconds);
        Console.Read();
    }
like image 874
Generic Comrade Avatar asked Nov 26 '11 17:11

Generic Comrade


2 Answers

Welcome to optimization. Things aren't always obvious here!

Basically, as you've found, when the compiler detects that certain types of safety constraints are proven to hold, it can issue enormously more efficient code when doing full optimization. Here (as MagnatLU shows) we see that knowing you've got an array allows all sorts of assumptions to be made about the size being fixed, and it allows memory to be accessed directly (which is also maximally efficient in how it integrates with the CPU's memory prefetch code, for bonus speed). When the compiler doesn't have the proof that it can generate super-fast code, it plays it safe. (This is the right thing to do.)

As a general comment, your workaround code is pretty simple when it comes to coding for optimization (when making the code super-readable and maintainable isn't always the first consideration). I don't really see how you could better it without making your class's API more complex (not a win!). Moreover, just adding a comment inside the body to say why you've done this would solve the maintenance issue; this is in fact one of the best uses for (non-doc) comments in the code in the first place. Given that the use case is large arrays (i.e., that it's reasonable to optimize at all in the first place) I'd say you have a great solution right there.

like image 31
Donal Fellows Avatar answered Nov 19 '22 12:11

Donal Fellows


Your best bet is to create overload for Sum with int[] as argument if this method is performance-critical. CLR's JIT can detect foreach-style iteration over array and can skip range checking and address each element directly. Each iteration of loop takes 3-5 instructions on x86, with only one memory lookup.

When using IList, JIT does not have knowledge about underlying collection's structure and ends up using IEnumerator<int>. Each iteration of loop uses two interface invocation - one for Current, one for MoveNext (2-3 memory lookups and a call for each of those). This most likely ends up with ~20 quite expensive instructions and there is not much you can do about it.

Edit: If you are curious about actual machine code emitted by JIT from Release build without debugger attached, here it is.

Array version

            int s = 0;
00000000  push        ebp  
00000001  mov         ebp,esp 
00000003  push        edi  
00000004  push        esi  
00000005  xor         esi,esi 
            foreach (int i in arg)
00000007  xor         edx,edx 
00000009  mov         edi,dword ptr [ecx+4] 
0000000c  test        edi,edi 
0000000e  jle         0000001B 
00000010  mov         eax,dword ptr [ecx+edx*4+8] 
                s += i;
00000014  add         esi,eax 
00000016  inc         edx  
            foreach (int i in arg)
00000017  cmp         edi,edx 
00000019  jg          00000010 

IEnumerable version

            int s = 0;
00000000  push        ebp  
00000001  mov         ebp,esp 
00000003  push        edi  
00000004  push        esi  
00000005  push        ebx  
00000006  sub         esp,1Ch 
00000009  mov         esi,ecx 
0000000b  lea         edi,[ebp-28h] 
0000000e  mov         ecx,6 
00000013  xor         eax,eax 
00000015  rep stos    dword ptr es:[edi] 
00000017  mov         ecx,esi 
00000019  xor         eax,eax 
0000001b  mov         dword ptr [ebp-18h],eax 
0000001e  xor         edx,edx 
00000020  mov         dword ptr [ebp-24h],edx 
            foreach (int i in arg)
00000023  call        dword ptr ds:[009E0010h] 
00000029  mov         dword ptr [ebp-28h],eax 
0000002c  mov         ecx,dword ptr [ebp-28h] 
0000002f  call        dword ptr ds:[009E0090h] 
00000035  test        eax,eax 
00000037  je          00000052 
00000039  mov         ecx,dword ptr [ebp-28h] 
0000003c  call        dword ptr ds:[009E0110h] 
                s += i;
00000042  add         dword ptr [ebp-24h],eax 
            foreach (int i in arg)
00000045  mov         ecx,dword ptr [ebp-28h] 
00000048  call        dword ptr ds:[009E0090h] 
0000004e  test        eax,eax 
00000050  jne         00000039 
00000052  mov         dword ptr [ebp-1Ch],0 
00000059  mov         dword ptr [ebp-18h],0FCh 
00000060  push        0F403BCh 
00000065  jmp         00000067 
00000067  cmp         dword ptr [ebp-28h],0 
0000006b  je          00000076 
0000006d  mov         ecx,dword ptr [ebp-28h] 
00000070  call        dword ptr ds:[009E0190h] 
like image 91
MagnatLU Avatar answered Nov 19 '22 11:11

MagnatLU