Why does decompiled code contain a foreach-loop?

Tags: c#, decompiling

I've implemented a foreach-loop and a while-loop that should create pretty much the same IL code.

The IL code (generated with compiler version 12.0.40629 for C# 5) is indeed almost identical (with the natural exception of some numbers and the like), yet the decompilers were able to reproduce the original code.

What's the key difference that allows a decompiler to tell that the former code block is a foreach-loop while the latter one represents a while-loop?

The decompiled code that I provide below was generated with the latest version (as of today) of ILSpy (2.3.1.1855), but I also used JustDecompile, .NET Reflector, and dotPeek, with no difference. I didn't configure anything; I just used the tools as they were installed.

Original code:

using System;
using System.Collections.Generic;

namespace ForeachVersusWhile
{
    public class Program
    {
        public static void Main(string[] args)
        {
            var x = new List<int> {1, 2};
            foreach (var item in x)
            {
                Console.WriteLine(item);
            }

            using (var enumerator = x.GetEnumerator())
            {
                while (enumerator.MoveNext())
                {
                    Console.WriteLine(enumerator.Current);
                }
            }
        }
    }
}

Decompiled code:

List<int> x = new List<int>
{
    1,
    2
};
foreach (int item in x)
{
    Console.WriteLine(item);
}
using (List<int>.Enumerator enumerator = x.GetEnumerator())
{
    while (enumerator.MoveNext())
    {
        Console.WriteLine(enumerator.Current);
    }
}

IL Code (loops only):

[...]
IL_0016: ldloc.0
IL_0017: callvirt instance valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<!0> class [mscorlib]System.Collections.Generic.List`1<int32>::GetEnumerator()
IL_001c: stloc.s CS$5$0000
.try
{
    IL_001e: br.s IL_002e
    // loop start (head: IL_002e)
        IL_0020: ldloca.s CS$5$0000
        IL_0022: call instance !0 valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<int32>::get_Current()
        IL_0027: stloc.1
        IL_0028: ldloc.1
        IL_0029: call void [mscorlib]System.Console::WriteLine(int32)

        IL_002e: ldloca.s CS$5$0000
        IL_0030: call instance bool valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<int32>::MoveNext()
        IL_0035: brtrue.s IL_0020
    // end loop

    IL_0037: leave.s IL_0047
} // end .try
finally
{
    IL_0039: ldloca.s CS$5$0000
    IL_003b: constrained. valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<int32>
    IL_0041: callvirt instance void [mscorlib]System.IDisposable::Dispose()
    IL_0046: endfinally
} // end handler

IL_0047: ldloc.0
IL_0048: callvirt instance valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<!0> class [mscorlib]System.Collections.Generic.List`1<int32>::GetEnumerator()
IL_004d: stloc.2
.try
{
    IL_004e: br.s IL_005c
    // loop start (head: IL_005c)
        IL_0050: ldloca.s enumerator
        IL_0052: call instance !0 valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<int32>::get_Current()
        IL_0057: call void [mscorlib]System.Console::WriteLine(int32)

        IL_005c: ldloca.s enumerator
        IL_005e: call instance bool valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<int32>::MoveNext()
        IL_0063: brtrue.s IL_0050
    // end loop

    IL_0065: leave.s IL_0075
} // end .try
finally
{
    IL_0067: ldloca.s enumerator
    IL_0069: constrained. valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<int32>
    IL_006f: callvirt instance void [mscorlib]System.IDisposable::Dispose()
    IL_0074: endfinally
} // end handler

Background to the question:

I've read an article where they took a look at what C# code gets compiled to. In the first step they looked at a simple example: the foreach-loop.

Backed up by MSDN, a foreach loop is supposed to "hide the complexity of the enumerators". IL code doesn't know anything of a foreach-loop. So, my understanding is that, under the hood, the IL code of a foreach-loop equals a while-loop using IEnumerator.MoveNext.
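To make that understanding concrete, here is a minimal sketch of the expansion as I read it from the C# language specification. The names ForeachExpansion, WithForeach, and Expanded are made up for illustration, and the explicit try/finally is only an approximation of what the compiler emits (the real output disposes the enumerator through IDisposable rather than calling Dispose directly).

```csharp
using System;
using System.Collections.Generic;

// Illustrative names only; nothing here comes from a real library.
public static class ForeachExpansion
{
    // The loop as it is written in source code.
    public static List<int> WithForeach(List<int> source)
    {
        var seen = new List<int>();
        foreach (var item in source)
        {
            seen.Add(item);
        }
        return seen;
    }

    // Approximately what the compiler turns that loop into.
    public static List<int> Expanded(List<int> source)
    {
        var seen = new List<int>();
        List<int>.Enumerator enumerator = source.GetEnumerator();
        try
        {
            while (enumerator.MoveNext())
            {
                // The iteration variable is an ordinary local that
                // receives the value of Current on every pass.
                int item = enumerator.Current;
                seen.Add(item);
            }
        }
        finally
        {
            // Assumption: a direct call stands in for the
            // IDisposable.Dispose call seen in the IL.
            enumerator.Dispose();
        }
        return seen;
    }

    public static void Main()
    {
        var x = new List<int> { 1, 2 };
        Console.WriteLine(string.Join(",", WithForeach(x))); // prints 1,2
        Console.WriteLine(string.Join(",", Expanded(x)));    // prints 1,2
    }
}
```

Both methods visit the same elements in the same order; the only thing the expanded form adds is that the enumerator and the iteration variable become visible as locals.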

Because the IL code doesn't represent a foreach-loop, a decompiler can hardly tell that a foreach-loop was used. That raised a couple of questions where people wondered why they saw a while-loop when they decompiled their own code. Here's one example.

I wanted to see that myself, so I wrote a small program with a foreach-loop and compiled it. Then I used a decompiler to see what the code looks like. I wasn't expecting a foreach-loop, but was surprised when I actually got one.

The pure IL code, naturally, contained calls of IEnumerator.MoveNext etc.

I suppose I'm doing something wrong that gives the tools access to more information and, in consequence, lets them correctly tell that I was using a foreach-loop. So, why am I seeing a foreach-loop instead of a while-loop using IEnumerator.MoveNext?

Em1 asked Jan 08 '16 13:01



1 Answer

Here's the code I compiled, which made it slightly easier to look at the differences:

using System;
using System.Collections.Generic;

class Test
{
    static void Main() {} // Just to make it simpler to compile

    public static void ForEach(List<int> x)
    {        
        foreach (var item in x)
        {
            Console.WriteLine(item);
        }
    }

    public static void While(List<int> x)
    {
        using (var enumerator = x.GetEnumerator())
        {
            while (enumerator.MoveNext())
            {
                Console.WriteLine(enumerator.Current);
            }
        }
    }
}

I'm using Roslyn, via VS2015 update 1 - version 1.1.0.51109.

Compiling with csc /o- /debug- Test.cs

In this case, Reflector 9.0.1.318 can tell the difference... and so can I. The locals for the foreach loop are:

.locals init (valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<int32> V_0,
       int32 V_1)

But the locals for the while loop are:

.locals init (valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<int32> V_0,
       bool V_1)

In the while loop, there's a stloc.1/ldloc.1 pair with the result of MoveNext(), but not with the result of Current... whereas in the foreach it's the other way round.

Compiling with csc /o+ /debug- Test.cs

In this case, Reflector showed a while loop in both cases, and the IL really was identical. There's no stloc.1/ldloc.1 pair in either loop.

Your IL

Looking at the IL that your compilation has come up with - again, there's the stloc.1/ldloc.1 pair for the Current property in the foreach loop.

Hand-crafted IL

I took the IL from the "can't tell the difference version" and just changed the .locals part and added stloc.1/ldloc.1 into the mix, and bingo - Reflector thought it was a foreach loop again.

So basically, while I don't know about other decompilers, it looks like Reflector uses what you do with the Current call as a signal.
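As a toy illustration of that kind of signal, here is a sketch of a classifier that guesses "foreach" when the result of get_Current is stored to a local straight away, and "while" otherwise. This is purely hypothetical: LoopGuess and Classify are invented names, the loop body is modelled as simplified opcode strings, and I have no knowledge of how Reflector actually implements its detection.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical toy; not how any real decompiler is known to work.
public static class LoopGuess
{
    // Takes the simplified instructions of a loop body and guesses
    // which C# construct produced them.
    public static string Classify(IList<string> loopBody)
    {
        for (int i = 0; i + 1 < loopBody.Count; i++)
        {
            bool callsCurrent =
                loopBody[i].StartsWith("call", StringComparison.Ordinal) &&
                loopBody[i].IndexOf("get_Current", StringComparison.Ordinal) >= 0;
            bool storesResult =
                loopBody[i + 1].StartsWith("stloc", StringComparison.Ordinal);

            // Current followed immediately by a store looks like the
            // iteration variable of a foreach loop.
            if (callsCurrent && storesResult)
            {
                return "foreach";
            }
        }
        return "while";
    }

    public static void Main()
    {
        // Simplified versions of the two loop bodies from the question.
        var first = new[] { "ldloca.s CS$5$0000", "call get_Current", "stloc.1", "ldloc.1", "call WriteLine" };
        var second = new[] { "ldloca.s enumerator", "call get_Current", "call WriteLine" };

        Console.WriteLine(Classify(first));  // prints foreach
        Console.WriteLine(Classify(second)); // prints while
    }
}
```

A real decompiler presumably checks far more than this (the try/finally shape, the Dispose call, how the local is used afterwards), so treat the sketch only as a way to picture the heuristic.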

Validation

I changed the While method to:

public static void While(List<int> x)
{        
    using (var enumerator = x.GetEnumerator())
    {
        while (enumerator.MoveNext())
        {
            int item = enumerator.Current;
            Console.WriteLine(item);
        }
    }
}

Now even with csc /o- /debug+, Reflector thinks the while loop is actually a foreach loop.

Jon Skeet answered Oct 18 '22 00:10
