In C# 5, the closure semantics of the foreach
statement (when the iteration variable is "captured" or "closed over" by anonymous functions) were famously changed (link to thread on that topic).
Question: Was it the intention to change this for arrays of pointer types also?
I ask because the "expansion" of a foreach statement over a pointer array has to be written differently from the expansion over other collections, for a technical reason: the Current property of System.Collections.IEnumerator cannot be used, since its declared type is object, which is incompatible with a pointer type. The relevant section in the C# Language Specification, version 5.0, "Pointer arrays", says that:
foreach (V v in x) EMBEDDED-STATEMENT
is expanded to:
{
T[,,…,] a = x;
V v;
for (int i0 = a.GetLowerBound(0); i0 <= a.GetUpperBound(0); i0++)
for (int i1 = a.GetLowerBound(1); i1 <= a.GetUpperBound(1); i1++)
…
for (int iN = a.GetLowerBound(N); iN <= a.GetUpperBound(N); iN++) {
v = (V)a.GetValue(i0,i1,…,iN);
EMBEDDED-STATEMENT
}
}
We note that the declaration V v;
is outside all the for
loops. So it would appear that the closure semantics are still the "old" C# 4 flavor: the loop variable is reused, i.e. it is "outer" with respect to the loop.
To make it clear what I am talking about, consider this complete C# 5 program:
using System;
using System.Collections.Generic;
static class Program
{
unsafe static void Main()
{
char* zeroCharPointer = null;
char*[] arrayOfPointers =
{ zeroCharPointer, zeroCharPointer + 1, zeroCharPointer + 2, zeroCharPointer + 100, };
var list = new List<Action>();
// foreach through pointer array, capture each foreach variable 'pointer' in a lambda
foreach (var pointer in arrayOfPointers)
list.Add(() => Console.WriteLine("Pointer address is {0:X2}.", (long)pointer));
Console.WriteLine("List complete");
// invoke those delegates
foreach (var act in list)
act();
}
// Possible output:
//
// List complete
// Pointer address is 00.
// Pointer address is 02.
// Pointer address is 04.
// Pointer address is C8.
//
// Or:
//
// List complete
// Pointer address is C8.
// Pointer address is C8.
// Pointer address is C8.
// Pointer address is C8.
}
So what is the correct output of the above program?
I've contacted Mads Torgersen, the C# Language PM, and it seems they simply forgot to update this part of the specification. His exact answer was (I asked why the spec wasn't updated):
because I forgot! :-) I now have in latest draft, and submitted to ECMA. Thanks!
So it seems that the C# 5 behavior applies to pointer arrays as well, which is why you are seeing the first output; that is the correct one.
I suppose the specification was simply not updated in this part (about pointer arrays) to reflect that the V variable now goes in the inner scope too. If you compile your example with the C# 5 compiler and look at the output, it looks like the expansion in the specification (with array access instead of GetValue, as you correctly point out in your comment), except that the V variable is inside all the for loops. The output is 00-02-04-C8, but of course you know all that yourself :)
Long story short: of course I cannot tell whether that was the intention, but my guess is that the variable was meant to move to the inner scope for all foreach loops, including those over pointer arrays, and the specification was just not updated to reflect that.
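To illustrate the scoping difference discussed above, here is a minimal sketch that expands the loop by hand in both ways. It uses int instead of a pointer type so that it compiles without unsafe; the names ScopeSketch, OuterScope and InnerScope are made up for this illustration and do not come from the specification or the compiler.

```csharp
using System;
using System.Collections.Generic;

static class ScopeSketch
{
    // Mirrors the expansion as printed in the 5.0 specification:
    // 'v' is declared once, outside the loop, so every lambda captures the same variable.
    public static List<Func<int>> OuterScope(int[] a)
    {
        var result = new List<Func<int>>();
        int v;
        for (int i = 0; i < a.Length; i++)
        {
            v = a[i];
            result.Add(() => v);
        }
        return result;
    }

    // Mirrors what the C# 5 compiler appears to do:
    // 'v' is declared inside the loop body, so each iteration gets a fresh variable.
    public static List<Func<int>> InnerScope(int[] a)
    {
        var result = new List<Func<int>>();
        for (int i = 0; i < a.Length; i++)
        {
            int v = a[i];
            result.Add(() => v);
        }
        return result;
    }
}
```

Called with { 0, 2, 4, 200 }, every delegate from OuterScope returns 200 (the last value assigned to the shared variable), while the delegates from InnerScope return 0, 2, 4 and 200 respectively.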
Your code compiles (C# 5.0) to the following IL code (comments in the code):
.method private hidebysig static void Main() cil managed
{
.entrypoint
.maxstack 6
.locals init (
[0] char* chPtr,
[1] char*[] chPtrArray,
[2] class [mscorlib]System.Collections.Generic.List`1<class [mscorlib]System.Action> list,
[3] char*[] chPtrArray2,
[4] int32 num,
[5] class ConsoleTests.Program/<>c__DisplayClass0_0 class_,
[6] valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator`0<class [mscorlib]System.Action> enumerator,
[7] class [mscorlib]System.Action action)
L_0000: nop
L_0001: ldc.i4.0 //{{{{{
L_0002: conv.u //chPtr = null;
L_0003: stloc.0 //}}}}}
L_0004: ldc.i4.4 //{{{{{
L_0005: newarr char* //Creates a new char*[4]}}}}}
L_000a: dup //{{{{{
L_000b: ldc.i4.0 // Sets the first element in the new
L_000c: ldloc.0 // char*[] to chPtr.
L_000d: stelem.i //}}}}}
L_000e: dup //{{{{{
L_000f: ldc.i4.1 //
L_0010: ldloc.0 // Sets the second element of the
L_0011: ldc.i4.2 // char*[] to chPtr + 1
L_0012: add // (loads 2 instead of 1 because char is UTF-16)
L_0013: stelem.i //}}}}}
L_0014: dup //{{{{{
L_0015: ldc.i4.2 //
L_0016: ldloc.0 //
L_0017: ldc.i4.2 // Sets the third element of the
L_0018: conv.i // char*[] to chPtr + 2
L_0019: ldc.i4.2 // (loads 4 instead of 2 because char is UTF-16)
L_001a: mul //
L_001b: add //
L_001c: stelem.i //}}}}}
L_001d: dup //{{{{{
L_001e: ldc.i4.3 //
L_001f: ldloc.0 //
L_0020: ldc.i4.s 100 // Sets the fourth element of the
L_0022: conv.i // char*[] to chPtr + 100
L_0023: ldc.i4.2 // (loads 200 instead of 100 because char is UTF-16)
L_0024: mul //
L_0025: add //
L_0026: stelem.i // }}}}}
L_0027: stloc.1 // chPtrArray = the new array that we have just filled.
L_0028: newobj instance void [mscorlib]System.Collections.Generic.List`1<class [mscorlib]System.Action>::.ctor() //{{{{{
L_002d: stloc.2 // list = new List<Action>()
L_002e: nop //}}}}}
L_002f: ldloc.1 //{{{{{
L_0030: stloc.3 //chPtrArray2 = chPtrArray}}}}}
L_0031: ldc.i4.0 //for (int num = 0; num < chPtrArray2.Length; num++)
L_0032: stloc.s num //
L_0034: br.s L_0062 //<<<<< (for start)
L_0036: newobj instance void ConsoleTests.Program/<>c__DisplayClass0_0::.ctor() //{{{{{
L_003b: stloc.s class_ //class_ = new temporary compile-time class
L_003d: ldloc.s class_ //}}}}}
L_003f: ldloc.3 //{{{{{
L_0040: ldloc.s num //
L_0042: ldelem.i //
L_0043: stfld char* ConsoleTests.Program/<>c__DisplayClass0_0::pointer //class_.pointer = chPtrArray2[num]}}}}}
L_0048: ldloc.2 //{{{{{
L_0049: ldloc.s class_ //
L_004b: ldftn instance void ConsoleTests.Program/<>c__DisplayClass0_0::<Main>b__0() // list.Add(class_.<Main>b__0);
L_0051: newobj instance void [mscorlib]System.Action::.ctor(object, native int) // (Adds the temporary compile-time class action, which has the correct pointer since
L_0056: callvirt instance void [mscorlib]System.Collections.Generic.List`1<class [mscorlib]System.Action>::Add(!0) //it is a specific class instance for this iteration, to the list)}}}}}
L_005b: nop
L_005c: ldloc.s num //practically the end of the for
L_005e: ldc.i4.1 // (actually increasing num and comparing)
L_005f: add //
L_0060: stloc.s num //
L_0062: ldloc.s num //
L_0064: ldloc.3 //
L_0065: ldlen //
L_0066: conv.i4 //
L_0067: blt.s L_0036 //>>>>> (for complete)
L_0069: ldstr "List complete" //Printing and stuff.....
L_006e: call void [mscorlib]System.Console::WriteLine(string)
L_0073: nop
L_0074: nop
L_0075: ldloc.2
L_0076: callvirt instance valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator`0<!0> [mscorlib]System.Collections.Generic.List`1<class [mscorlib]System.Action>::GetEnumerator()
L_007b: stloc.s enumerator
L_007d: br.s L_0090
L_007f: ldloca.s enumerator
L_0081: call instance !0 [mscorlib]System.Collections.Generic.List`1/Enumerator`0<class [mscorlib]System.Action>::get_Current()
L_0086: stloc.s action
L_0088: ldloc.s action
L_008a: callvirt instance void [mscorlib]System.Action::Invoke()
L_008f: nop
L_0090: ldloca.s enumerator
L_0092: call instance bool [mscorlib]System.Collections.Generic.List`1/Enumerator`0<class [mscorlib]System.Action>::MoveNext()
L_0097: brtrue.s L_007f
L_0099: leave.s L_00aa
L_009b: ldloca.s enumerator
L_009d: constrained. [mscorlib]System.Collections.Generic.List`1/Enumerator`0<class [mscorlib]System.Action>
L_00a3: callvirt instance void [mscorlib]System.IDisposable::Dispose()
L_00a8: nop
L_00a9: endfinally
L_00aa: ret
.try L_007d to L_009b finally handler L_009b to L_00aa
}
As you can see, the compiler generates a class called <>c__DisplayClass0_0, which contains the body of your Action and a field of type char*. The class looks like this:
[CompilerGenerated]
private sealed class <>c__DisplayClass0_0
{
// Fields
public unsafe char* pointer;
// Methods
internal unsafe void <Main>b__0()
{
Console.WriteLine("Pointer address is {0:X2}.", (long) ((ulong) this.pointer));
}
}
In the MSIL code we can see that the foreach
is compiled to the following for loop:
char*[] localArray = arrayOfPointers; // copies the array reference only, not the elements
for (int num = 0; num < localArray.Length; num++)
{
<>c__DisplayClass0_0 temp = new <>c__DisplayClass0_0(); // a fresh closure object per iteration
temp.pointer = localArray[num];
list.Add(temp.<Main>b__0); //Adds the action to the list of actions
}
What this means is that the value of the pointer is copied on each iteration, when the delegate is created, so the value of pointer at that time is the one that will be printed (that is, each action belongs to its own instance of <>c__DisplayClass0_0 and gets its own copy of the pointer).
As we just saw, the only variable "reused" from before the foreach is the array itself; the captured pointers are not reused. So if the specification says what you quote, it is wrong, since the expansion you attached suggests that the output should be C8 C8 C8 C8. And the actual result:
List complete
Pointer address is 00.
Pointer address is 02.
Pointer address is 04.
Pointer address is C8.
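The per-iteration copy described in this answer can also be imitated by hand. The following is only a sketch under stated assumptions: PointerBox and HoistingSketch are hypothetical names standing in for the compiler-generated <>c__DisplayClass0_0, and long replaces char* so that no unsafe context is needed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical hand-written stand-in for the compiler-generated closure class.
sealed class PointerBox
{
    public long Pointer; // plays the role of the captured 'pointer' field

    public string Describe()
    {
        return string.Format("Pointer address is {0:X2}.", Pointer);
    }
}

static class HoistingSketch
{
    public static List<Func<string>> Build(long[] addresses)
    {
        var list = new List<Func<string>>();
        for (int num = 0; num < addresses.Length; num++)
        {
            // A new box per iteration, so each delegate keeps its own copy of the value.
            var box = new PointerBox();
            box.Pointer = addresses[num];
            list.Add(box.Describe);
        }
        return list;
    }
}
```

Because a new PointerBox is allocated on every iteration, invoking the delegates later still yields the value that was current when each one was created.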