
Variables ending with "1" have the "1" removed within ILSpy. Why?

Tags: c#, ilspy

In an effort to explore how the C# compiler optimizes code, I've created a simple test application. With each test change, I've compiled the application and then opened the binary in ILSpy.

I just noticed something that seems weird to me. Presumably it is intentional, but I can't think of a good reason why the compiler would do this.

Consider the following code:

static void Main(string[] args)
{
    int test_1 = 1;
    int test_2 = 0;
    int test_3 = 0;

    if (test_1 == 1) Console.Write(1);
    else if (test_2 == 1) Console.Write(1);
    else if (test_3 == 1) Console.Write(2);
    else Console.Write("x");
}

The code is pointless, but I wrote it to see how ILSpy would interpret the if statements.

However, when I compiled and decompiled this code, I noticed something that had me scratching my head: my first variable, test_1, was renamed to test_ (the trailing "1" was dropped). Is there a good reason why the C# compiler would do this?

For full inspection, this is the output of Main() that I'm seeing in ILSpy.

private static void Main(string[] args)
{
    int test_ = 1; //Where did the "1" go at the end of the variable name???
    int test_2 = 0;
    int test_3 = 0;
    if (test_ == 1)
    {
        Console.Write(1);
    }
    else
    {
        if (test_2 == 1)
        {
            Console.Write(1);
        }
        else
        {
            if (test_3 == 1)
            {
                Console.Write(2);
            }
            else
            {
                Console.Write("x");
            }
        }
    }
}

UPDATE

After inspecting the IL, it appears that this is an issue with ILSpy, not the C# compiler. Eugene Podskal has given a good answer to my initial comments and observations. However, I would still like to know whether this is a bug in ILSpy or intentional functionality.

RLH asked Sep 05 '14 18:09


1 Answer

It is probably a problem with the decompiler, because the IL is correct on .NET 4.5 with VS2013:

.entrypoint
  // Code size       79 (0x4f)
  .maxstack  2
  .locals init ([0] int32 test_1,
           [1] int32 test_2,
           [2] int32 test_3,
           [3] bool CS$4$0000)
  IL_0000:  nop
  IL_0001:  ldc.i4.1
  IL_0002:  stloc.0

Edit: ildasm uses data from the .pdb file (see this answer) to get the correct variable names. Without the .pdb, the variables would be named V_0, V_1, V_2.

EDIT:

The variable name is mangled in the file NameVariables.cs, in this method:

public string GetAlternativeName(string oldVariableName)
{
    if (oldVariableName.Length == 1 && oldVariableName[0] >= 'i' && oldVariableName[0] <= maxLoopVariableName) {
        for (char c = 'i'; c <= maxLoopVariableName; c++) {
            if (!typeNames.ContainsKey(c.ToString())) {
                typeNames.Add(c.ToString(), 1);
                return c.ToString();
            }
        }
    }

    int number;
    string nameWithoutDigits = SplitName(oldVariableName, out number);

    if (!typeNames.ContainsKey(nameWithoutDigits)) {
        typeNames.Add(nameWithoutDigits, number - 1);
    }

    int count = ++typeNames[nameWithoutDigits];

    if (count != 1) {
        return nameWithoutDigits + count.ToString();
    } else {
        return nameWithoutDigits;
    }
}

The NameVariables class uses the this.typeNames dictionary to map each variable name, stripped of its trailing number, to a counter of how many times that base name has appeared in the method being decompiled. (Such names might mean something special to ILSpy, or perhaps even to IL, but I doubt it.)

This means that all three variables (test_1, test_2, test_3) end up in a single slot ("test_"). For the first one, the entry is added as number - 1 and then incremented, so count is 1 and this branch executes:

else {
    return nameWithoutDigits;
}

where nameWithoutDigits is test_
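
To make the counting concrete, here is a minimal sketch that copies only the numbering part of GetAlternativeName from above into a standalone class. SplitName is not shown in this post, so the version here is an assumption: it strips trailing digits and reports them as the number (or 1 if there are none). The class name NameSketch is made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// Illustrative stand-in for the numbering branch quoted above; not ILSpy's actual class.
class NameSketch
{
    // Base name (trailing digits removed) -> last number handed out for it.
    private readonly Dictionary<string, int> typeNames = new Dictionary<string, int>();

    // ASSUMPTION: the real SplitName is not shown in the post. This version
    // strips trailing ASCII digits and returns them as 'number' (1 if none).
    private static string SplitName(string name, out int number)
    {
        int pos = name.Length;
        while (pos > 0 && name[pos - 1] >= '0' && name[pos - 1] <= '9')
            pos--;
        if (pos < name.Length && int.TryParse(name.Substring(pos), out number))
            return name.Substring(0, pos);
        number = 1;
        return name;
    }

    // Same logic as the second half of GetAlternativeName in the answer.
    public string GetAlternativeName(string oldVariableName)
    {
        int number;
        string nameWithoutDigits = SplitName(oldVariableName, out number);

        if (!typeNames.ContainsKey(nameWithoutDigits))
            typeNames.Add(nameWithoutDigits, number - 1);

        int count = ++typeNames[nameWithoutDigits];

        // A count of 1 is treated as "no suffix needed", which drops the "1".
        return count != 1 ? nameWithoutDigits + count.ToString() : nameWithoutDigits;
    }

    static void Main()
    {
        var names = new NameSketch();
        foreach (string name in new[] { "test_1", "test_2", "test_3" })
            Console.WriteLine(name + " -> " + names.GetAlternativeName(name));
    }
}
```

Under that assumption the sketch maps test_1 to test_, while test_2 and test_3 keep their suffixes, which matches what the question shows in ILSpy.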

EDIT

First, thanks to @HansPassant, whose answer pointed out the fault in this post.

So, the source of the problem:

ILSpy is as smart as ildasm, because it also uses .pdb data (how else would it get the names test_1 and test_2 at all?). But its inner workings are tuned for assemblies without any debug information, so its optimizations for dealing with V_0, V_1, V_2 variables behave inconsistently when given the richer metadata from a .pdb file.

As I understand it, the culprit is an optimization to remove _0 from lone variables.

Fixing it will probably require propagating the fact that .pdb data is in use into the variable name generation code.
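
As a rough illustration of that idea only, the sketch below passes a flag into the naming code saying whether a name came from debug symbols, and keeps such names verbatim. Everything here (PdbAwareNamer, GetName, the fromDebugSymbols parameter) is hypothetical and is not taken from ILSpy's source; it is one possible shape for such a fix, not the actual one.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch, not ILSpy code: a namer that knows where a name came from.
class PdbAwareNamer
{
    private readonly HashSet<string> usedNames = new HashSet<string>();
    private int nextGenerated;

    public string GetName(string originalName, bool fromDebugSymbols)
    {
        // Names the programmer wrote are kept verbatim as long as they are unused.
        if (fromDebugSymbols && !string.IsNullOrEmpty(originalName) && usedNames.Add(originalName))
            return originalName;

        // Otherwise fall back to generated names, skipping any that are taken.
        string candidate;
        do
        {
            candidate = "V_" + nextGenerated++;
        } while (!usedNames.Add(candidate));
        return candidate;
    }

    static void Main()
    {
        var namer = new PdbAwareNamer();
        Console.WriteLine(namer.GetName("test_1", true));   // name from the .pdb
        Console.WriteLine(namer.GetName(null, false));      // no debug info available
    }
}
```

The design choice is simply that renaming heuristics only run for names the decompiler had to invent, so a user-chosen name such as test_1 is never split into a base name and a counter.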

Eugene Podskal answered Sep 22 '22 05:09