I wrote a simple program to examine how IL works:
void Main()
{
    int a = 5;
    int b = 6;
    if (a < b) Console.Write("333");
    Console.ReadLine();
}
The IL:
IL_0000: ldc.i4.5
IL_0001: stloc.0
IL_0002: ldc.i4.6
IL_0003: stloc.1
IL_0004: ldloc.0
IL_0005: ldloc.1
IL_0006: bge.s IL_0012
IL_0008: ldstr "333"
IL_000D: call System.Console.Write
IL_0012: call System.Console.ReadLine
I'm trying to understand how efficient the emitted code is:
Line #1 of the IL pushes the value 5 onto the stack (4 bytes, i.e. an int32).
Line #2 of the IL pops it from the stack into a local variable.
The same goes for the next 2 lines.
Then it loads those local variables back onto the stack, and only THEN does it evaluate bge.s.
Question #1
Why does the compiler load the local variables onto the stack? The values were already on the stack, but it popped them in order to store them in local variables. Isn't that a waste?
I mean, why couldn't the code be something like:
IL_0000: ldc.i4.5
IL_0001: ldc.i4.6
IL_0002: bge.s IL_0004
IL_0003: ldstr "333"
IL_0004: call System.Console.Write
IL_0005: call System.Console.ReadLine
My sample is just 5 lines of code. What about 50,000,000 lines of code? There would be plenty of extra code emitted in the IL.
Question #2
Looking at the code addresses: where is the IL_0009 address? Isn't it supposed to be sequential?
P.S. I'm compiling with the Optimize flag on, in Release mode.
I can answer the second question easily. The instructions are variable-length. For example, the ldstr "333" consists of the opcode for ldstr (at address 8) followed by the data representing the string (a reference to the string in the user string table).
Similarly with the call statements following that - you need the call opcode itself plus the information on the functions to call.
The reason the instructions for pushing small values like 5 or 6 onto the stack don't have extra data is that the values are encoded into the opcode itself.
See here for the instructions and encodings.
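To make the variable-length point concrete, here is a minimal C# sketch that walks the bytes of the method from the question and prints the offset of each instruction. The names IlOffsets, Known, Sample and Disassemble are made up for this illustration, the table covers only the opcodes that appear in the listing, and the two call tokens are placeholders, since the question does not show their actual values.

```csharp
using System;
using System.Collections.Generic;

static class IlOffsets
{
    // One-byte opcodes used in the question's listing, with the size in
    // bytes of the operand that follows each opcode in the IL stream.
    static readonly Dictionary<byte, (string Name, int OperandSize)> Known =
        new Dictionary<byte, (string Name, int OperandSize)>
        {
            [0x06] = ("ldloc.0", 0),
            [0x07] = ("ldloc.1", 0),
            [0x0A] = ("stloc.0", 0),
            [0x0B] = ("stloc.1", 0),
            [0x1B] = ("ldc.i4.5", 0), // the constant 5 is part of the opcode
            [0x1C] = ("ldc.i4.6", 0), // the constant 6 is part of the opcode
            [0x2F] = ("bge.s", 1),    // 1-byte relative branch offset
            [0x28] = ("call", 4),     // 4-byte metadata token
            [0x72] = ("ldstr", 4),    // 4-byte metadata token
        };

    // The instructions shown in the question, as raw bytes.
    public static readonly byte[] Sample =
    {
        0x1B,                         // IL_0000: ldc.i4.5
        0x0A,                         // IL_0001: stloc.0
        0x1C,                         // IL_0002: ldc.i4.6
        0x0B,                         // IL_0003: stloc.1
        0x06,                         // IL_0004: ldloc.0
        0x07,                         // IL_0005: ldloc.1
        0x2F, 0x0A,                   // IL_0006: bge.s (0x0008 + 0x0A = 0x0012)
        0x72, 0x01, 0x00, 0x00, 0x70, // IL_0008: ldstr, token 0x70000001
        0x28, 0x01, 0x00, 0x00, 0x0A, // IL_000D: call, placeholder token
        0x28, 0x02, 0x00, 0x00, 0x0A, // IL_0012: call, placeholder token
    };

    // Returns one "IL_xxxx: name" line per instruction. Each step advances
    // by the opcode byte plus the size of its operand.
    public static List<string> Disassemble(byte[] il)
    {
        var lines = new List<string>();
        int offset = 0;
        while (offset < il.Length)
        {
            byte op = il[offset];
            if (!Known.TryGetValue(op, out var info))
                throw new NotSupportedException(
                    $"Opcode 0x{op:X2} is not in this sketch's table.");
            lines.Add($"IL_{offset:X4}: {info.Name}");
            offset += 1 + info.OperandSize;
        }
        return lines;
    }
}
```

Printing the result of IlOffsets.Disassemble(IlOffsets.Sample) line by line gives the same offsets as the listing in the question: the jump from IL_0006 to IL_0008 comes from the one-byte operand of bge.s, and the jumps from IL_0008 to IL_000D to IL_0012 come from the four-byte tokens.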
As to the first question, you may want to look at this blog entry by Eric Lippert, one of the C# developers, which states:
The /optimize flag does not change a huge amount of our emitting and generation logic. We try to always generate straightforward, verifiable code and then rely upon the jitter to do the heavy lifting of optimizations when it generates the real machine code.
Why does the compiler load the local variables onto the stack? The values were already on the stack, but it popped them in order to store them in local variables. Isn't that a waste?
A waste of what? You have to remember that IL (usually) isn't executed as it is; it's compiled again by the JIT compiler, which performs most of the optimizations. One of the points of using an “intermediate language” is that optimizations can be implemented in one place, the JIT compiler, so each language (C#, VB.NET, F#, …) doesn't have to implement them all over again. This is explained by Eric Lippert in his article Why IL?
Where is the IL_0009 address? Isn't it supposed to be sequential?
Let's have a look at the specification of the ldstr instruction (from ECMA-335):
III.4.16 ldstr – load a literal string
Format: 72 <T> […]
The ldstr instruction pushes a new string object representing the literal stored in the metadata as string (which is a string literal).
That reference to metadata above and the <T> mean that the byte 72 (hexadecimal) of the instruction is followed by a metadata token, which points to a table containing strings. How big is such a token? From section III.1.9 of the same document:
Many CIL instructions are followed by a "metadata token". This is a 4-byte value, that specifies a row in a metadata table […]
So, in your case, the byte 72 of the instruction is at the address 0008 and the token (0x70000001 in this case, where the 0x70 byte represents the user strings table) is at addresses 0009 to 000C.
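As a rough illustration of that layout, the following C# sketch reads the four bytes after the ldstr opcode as a little-endian value and splits it into the table byte and the remaining three bytes. TokenSketch and its members are hypothetical names used only here; the byte values are the ones described above.

```csharp
using System;

static class TokenSketch
{
    // The five bytes of the ldstr instruction from the question:
    // the opcode 0x72 followed by the token 0x70000001, stored little-endian.
    public static readonly byte[] LdstrBytes = { 0x72, 0x01, 0x00, 0x00, 0x70 };

    // Reads the 4-byte little-endian token that follows a one-byte opcode.
    public static uint ReadToken(byte[] il, int opcodeOffset)
    {
        int p = opcodeOffset + 1; // skip the opcode byte itself
        return unchecked((uint)(il[p]
                              | (il[p + 1] << 8)
                              | (il[p + 2] << 16)
                              | (il[p + 3] << 24)));
    }

    // The most significant byte of the token identifies the table.
    public static byte Table(uint token) => (byte)(token >> 24);

    // The low three bytes select the entry within that table.
    public static uint Index(uint token) => token & 0x00FFFFFFu;
}
```

With these bytes, TokenSketch.ReadToken(TokenSketch.LdstrBytes, 0) gives 0x70000001, Table gives 0x70 and Index gives 1. In the full method the opcode sits at offset 0008, so the token occupies 0009 to 000C and the next instruction starts at 000D.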