I have seen a lot of questions about whether to declare variables inside or outside a for loop's scope; this has been discussed at length in several other questions. The answer is that there is absolutely no performance difference (the IL is identical), but for clarity, declaring variables in the tightest scope possible is preferred.
I was curious about a slightly different situation:
int i;
for (i = 0; i < 10; i++) {
    Console.WriteLine(i);
}
for (i = 0; i < 10; i++) {
    Console.WriteLine(i);
}
versus
for (int i = 0; i < 10; i++) {
    Console.WriteLine(i);
}
for (int i = 0; i < 10; i++) {
    Console.WriteLine(i);
}
I expected both methods to compile to the same IL in Release mode. However, this is not the case. I'll spare you the full IL and just point out the difference. The first method has one local:
.locals init (
[0] int32 i
)
while the second has two locals, one for each for loop counter:
.locals init (
[0] int32 i,
[1] int32 i
)
So there is a difference between these two which is not optimized away, which is surprising to me.
Why am I seeing this, and is there actually a performance difference between the two methods?
To answer your question, you've actually declared one local variable in the first case, and two in the second. The C# compiler apparently does not reuse the local variables even though I think it would be permitted to do so. My guess is that this is just not a performance gain that is worth writing a complex analysis to handle and might not even be useful if the JIT is smart enough to handle it anyway. However, the optimization you are expecting to see is done, just not at the IL level. It is done by the JIT compiler in the emitted machine code.
This is a simple enough case where inspecting the emitted machine code is actually informative. The summary is that these two methods will JIT compile to the same machine code (x86 shown below, but x64 machine code is the same as well) and thus there is no performance gain from using fewer local variables.
A quick note on the conditions: I took both of these fragments and put them into separate methods, then looked at the disassembly in Visual Studio 2015 on the .NET 4.6.1 runtime, with an x86 Release build (i.e. optimizations on), attaching the debugger only after the JIT had compiled the methods (at least one invocation ran without the debugger attached). I disabled method inlining to keep things consistent between the two methods. To view the disassembly, place a breakpoint in the desired method, attach, go to Debug > Windows > Disassembly, and hit F5 to run to the breakpoint.
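Something along the following lines reproduces that setup; the class and method names are illustrative rather than taken from the original code, but the NoInlining attribute and the extra run before attaching are the important parts:

using System;
using System.Runtime.CompilerServices;

static class LoopScopeTest
{
    // NoInlining keeps each fragment in its own JIT-compiled method,
    // so each one can be disassembled independently.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static void SharedCounter()
    {
        int i;
        for (i = 0; i < 10; i++) { Console.WriteLine(i); }
        for (i = 0; i < 10; i++) { Console.WriteLine(i); }
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    static void SeparateCounters()
    {
        for (int i = 0; i < 10; i++) { Console.WriteLine(i); }
        for (int i = 0; i < 10; i++) { Console.WriteLine(i); }
    }

    static void Main()
    {
        // Run both methods once without the debugger attached so the JIT
        // compiles them with optimizations, then attach and break.
        SharedCounter();
        SeparateCounters();
        Console.ReadLine();
    }
}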
Without further ado, the first method disassembles to
for (i = 0; i < 10; i++)
010204A2 in al,dx
010204A3 push esi
010204A4 xor esi,esi
{
Console.WriteLine(i);
010204A6 mov ecx,esi
010204A8 call 71686C0C
for (i = 0; i < 10; i++)
010204AD inc esi
010204AE cmp esi,0Ah
010204B1 jl 010204A6
}
for (i = 0; i < 10; i++)
010204B3 xor esi,esi
{
Console.WriteLine(i);
010204B5 mov ecx,esi
010204B7 call 71686C0C
for (i = 0; i < 10; i++)
010204BC inc esi
010204BD cmp esi,0Ah
010204C0 jl 010204B5
010204C2 pop esi
010204C3 pop ebp
010204C4 ret
The second method disassembles to
for (int i = 0; i < 10; i++)
010204DA in al,dx
010204DB push esi
010204DC xor esi,esi
{
Console.WriteLine(i);
010204DE mov ecx,esi
010204E0 call 71686C0C
for (int i = 0; i < 10; i++)
010204E5 inc esi
010204E6 cmp esi,0Ah
010204E9 jl 010204DE
}
for (int i = 0; i < 10; i++)
010204EB xor esi,esi
{
Console.WriteLine(i);
010204ED mov ecx,esi
010204EF call 71686C0C
for (int i = 0; i < 10; i++)
010204F4 inc esi
010204F5 cmp esi,0Ah
010204F8 jl 010204ED
010204FA pop esi
010204FB pop ebp
010204FC ret
As you can see, aside from different offsets for the appropriate jumps, the code is identical.
These methods are simple enough that the work of keeping track of the loop counter is done entirely in the esi register.
It is left as an exercise for the reader to verify in x64.
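If you would rather measure than read disassembly, a small benchmark along these lines (a sketch using BenchmarkDotNet; the class and method names are mine, and Console.WriteLine is replaced by a cheap addition so that console I/O does not drown out the loops) should show no measurable difference between the two forms, on x64 as well as x86:

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public class CounterScopeBenchmark
{
    [Benchmark]
    public int SharedCounter()
    {
        int sum = 0;
        int i;
        for (i = 0; i < 10; i++) { sum += i; }
        for (i = 0; i < 10; i++) { sum += i; }
        return sum; // returning the result keeps the loops from being optimized away
    }

    [Benchmark]
    public int SeparateCounters()
    {
        int sum = 0;
        for (int i = 0; i < 10; i++) { sum += i; }
        for (int i = 0; i < 10; i++) { sum += i; }
        return sum;
    }

    public static void Main() => BenchmarkRunner.Run<CounterScopeBenchmark>();
}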
As an addition to the existing answer, note that collapsing the two variables into one could actually hurt performance, depending on what information the JIT compiler is able to infer.
If the JIT compiler sees two variables with non-overlapping lifetimes, it is free to use the same location (usually a register) for both. But if the JIT compiler sees a single variable, it is required to use the same location. Or, more accurately, it is required to maintain the value of the variable for its whole lifetime.
In your specific case, that would mean that after the first loop ends and before the second loop starts, the compiler can't throw away the value of the variable and reuse the location for some other purpose.
But even with a single IL variable, it's not given that the JIT compiler actually sees it as a single variable. A smart compiler can see that when the code leaves the first loop, the variable is not going to be read again, before it's overwritten. So it can treat the single IL variable as two, and throw away the value between the loops.
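To make the lifetime point concrete, here is a hypothetical variation (DoOtherWork and the read of i between the loops are made up for illustration) in which the single variable genuinely must stay alive between the loops, so the JIT cannot hand its location to anything else in that span:

using System;

class LifetimeExample
{
    // Stand-in for unrelated work that happens between the two loops.
    static void DoOtherWork() { }

    static void Main()
    {
        int i;
        for (i = 0; i < 10; i++)
        {
            Console.WriteLine(i);
        }

        // Because i is read again below, its value has to survive across
        // DoOtherWork(), so the register (or stack slot) holding it is
        // tied up for that whole span.
        DoOtherWork();
        Console.WriteLine("first loop stopped at " + i);

        for (i = 0; i < 10; i++)
        {
            Console.WriteLine(i);
        }
    }
}

In the snippets from the question there is no such read between the loops, which is exactly what allows a smart JIT compiler to split the single IL variable in two.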
To sum up: either the JIT compiler is not smart enough to split a single IL variable with non-overlapping uses, in which case emitting two separate IL variables is what lets it reuse the register, or it is smart enough, in which case both versions end up equivalent anyway. Either way, it makes sense for the C# compiler to use two variables in IL.
Just to add a few things to the detailed answer above. The C# compiler itself performs very few optimizations, such as concatenating string literals ("a" + "b") and folding constant expressions. It is therefore rather pointless to look at the IL generated by the C# compiler when you are interested in optimizations; instead, look at the assembly generated by the JIT compiler.
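For example, both of the following are evaluated at compile time (a minimal illustration; the class name is mine):

using System;

class CompileTimeFolding
{
    static void Main()
    {
        // Folded by the C# compiler: the IL contains the single literal "ab",
        // not a call to string.Concat.
        string s = "a" + "b";

        // Also folded: the IL loads the constant 3600 directly.
        int secondsPerHour = 60 * 60;

        Console.WriteLine(s);
        Console.WriteLine(secondsPerHour);
    }
}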
Also, the build settings can suppress JIT optimizations, so make sure you use a Release build and clear the "Suppress JIT optimization on module load" flag in the Visual Studio debugging options.