
Closure semantics for foreach over arrays of pointer types

In C# 5, the closure semantics of the foreach statement (when the iteration variable is "captured" or "closed over" by anonymous functions) were famously changed (link to thread on that topic).

Question: Was it the intention to change this for arrays of pointer types also?

The reason I ask is that, for technical reasons, the "expansion" of a foreach statement over a pointer array has to be written differently from foreach over other collections: we cannot use the Current property of System.Collections.IEnumerator, since its declared type is object, which is incompatible with a pointer type. The relevant section of the C# Language Specification, "Pointer arrays", in version 5.0, says that:

foreach (V v in x) EMBEDDED-STATEMENT

is expanded to:

{
  T[,,…,] a = x;
  V v;
  for (int i0 = a.GetLowerBound(0); i0 <= a.GetUpperBound(0); i0++)
  for (int i1 = a.GetLowerBound(1); i1 <= a.GetUpperBound(1); i1++)
  …
  for (int iN = a.GetLowerBound(N); iN <= a.GetUpperBound(N); iN++) {
    v = (V)a.GetValue(i0,i1,…,iN);
    EMBEDDED-STATEMENT
  }
}

We note that the declaration V v; is outside all the for loops. So it would appear that the closure semantics are still the "old" C# 4 flavor, "loop variable is reused, loop variable is "outer" with respect to the loop".
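The difference between the two scoping rules does not depend on pointers. As a minimal sketch, assuming a one-dimensional int array and the hypothetical names ScopeDemo, CaptureOuter and CaptureInner, the two expansions can be hand-written and compared:

```csharp
using System;
using System.Collections.Generic;

static class ScopeDemo
{
    // Variable declared outside the loop (C# 4 style expansion):
    // every lambda captures the same 'v'.
    public static List<int> CaptureOuter(int[] a)
    {
        var funcs = new List<Func<int>>();
        int v;
        for (int i = 0; i < a.Length; i++)
        {
            v = a[i];
            funcs.Add(() => v);
        }
        // Invoke the delegates only after the loop has finished.
        return funcs.ConvertAll(f => f());
    }

    // Variable declared inside the loop (C# 5 style expansion):
    // each iteration gets a fresh 'v'.
    public static List<int> CaptureInner(int[] a)
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < a.Length; i++)
        {
            int v = a[i];
            funcs.Add(() => v);
        }
        return funcs.ConvertAll(f => f());
    }

    static void Main()
    {
        int[] data = { 0, 2, 4, 200 };
        Console.WriteLine(string.Join(" ", CaptureOuter(data))); // 200 200 200 200
        Console.WriteLine(string.Join(" ", CaptureInner(data))); // 0 2 4 200
    }
}
```

The only difference between the two methods is where v is declared, which is exactly the point in question for the "Pointer arrays" expansion.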

To make it clear what I am talking about, consider this complete C# 5 program:

using System;
using System.Collections.Generic;

static class Program
{
  unsafe static void Main()
  {
    char* zeroCharPointer = null;
    char*[] arrayOfPointers =
      { zeroCharPointer, zeroCharPointer + 1, zeroCharPointer + 2, zeroCharPointer + 100, };

    var list = new List<Action>();

    // foreach through pointer array, capture each foreach variable 'pointer' in a lambda
    foreach (var pointer in arrayOfPointers)
      list.Add(() => Console.WriteLine("Pointer address is {0:X2}.", (long)pointer));

    Console.WriteLine("List complete");
    // invoke those delegates
    foreach (var act in list)
      act();
  }

  // Possible output:
  //
  // List complete
  // Pointer address is 00.
  // Pointer address is 02.
  // Pointer address is 04.
  // Pointer address is C8.
  //
  // Or:
  //
  // List complete
  // Pointer address is C8.
  // Pointer address is C8.
  // Pointer address is C8.
  // Pointer address is C8.
}

So what is the correct output of the above program?

Jeppe Stig Nielsen asked Sep 17 '15 14:09


3 Answers

I've contacted Mads Torgersen, the C# Language PM, and it seems they simply forgot to update this part of the specification. His exact answer was (I asked why the spec wasn't updated):

because I forgot! :-) I now have in latest draft, and submitted to ECMA. Thanks!

So it seems that the C# 5 behavior applies to pointer arrays as well, which is why you're seeing the first output. That output is the correct one.

Yuval Itzchakov answered Oct 05 '22 23:10


I suppose the specification was simply not updated in this part (about pointer arrays) to reflect that the V variable moves to the inner scope too. If you compile your example with the C# 5 compiler and look at the output, it looks like the expansion in the specification (with array access instead of GetValue, as you correctly point out in your comment), except that the V variable is declared inside all the for loops. The output is 00-02-04-C8, but of course you know all that yourself :)

Long story short: of course I cannot tell whether that was the intention, but my guess is that the variable was meant to move to the inner scope for all foreach loops, including those over pointer arrays, and the specification was just not updated to reflect that.

Evk answered Oct 05 '22 23:10


Your code compiles (C# 5.0) to the following IL (comments inline):

.method private hidebysig static void Main() cil managed
{
    .entrypoint
    .maxstack 6
    .locals init (
        [0] char* chPtr,
        [1] char*[] chPtrArray,
        [2] class [mscorlib]System.Collections.Generic.List`1<class [mscorlib]System.Action> list,
        [3] char*[] chPtrArray2,
        [4] int32 num,
        [5] class ConsoleTests.Program/<>c__DisplayClass0_0 class_,
        [6] valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator`0<class [mscorlib]System.Action> enumerator,
        [7] class [mscorlib]System.Action action)
    L_0000: nop 
    L_0001: ldc.i4.0 //{{{{{
    L_0002: conv.u  //chPtr = null;
    L_0003: stloc.0 //}}}}}
    L_0004: ldc.i4.4 //{{{{{
    L_0005: newarr char* //Creates a new char*[4]}}}}}
    L_000a: dup //{{{{{
    L_000b: ldc.i4.0 // Sets the first element in the new
    L_000c: ldloc.0 // char*[] to chPtr.
    L_000d: stelem.i //}}}}}
    L_000e: dup //{{{{{
    L_000f: ldc.i4.1 //
    L_0010: ldloc.0 // Sets the second element of the
    L_0011: ldc.i4.2 // char*[] to chPtr + 1 
    L_0012: add // (loads 2 instead of 1 because char is UTF-16)
    L_0013: stelem.i //}}}}}
    L_0014: dup //{{{{{
    L_0015: ldc.i4.2 // 
    L_0016: ldloc.0 //
    L_0017: ldc.i4.2 // Sets the third element of the
    L_0018: conv.i // char*[] to chPtr + 2
    L_0019: ldc.i4.2 // (loads 4 instead of 2 because char is UTF-16)
    L_001a: mul //
    L_001b: add //
    L_001c: stelem.i //}}}}}
    L_001d: dup //{{{{{
    L_001e: ldc.i4.3 //
    L_001f: ldloc.0 //
    L_0020: ldc.i4.s 100 // Sets the fourth element of the
    L_0022: conv.i // char*[] to chPtr + 100
    L_0023: ldc.i4.2 // (loads 200 instead of 100 because char is UTF-16)
    L_0024: mul //
    L_0025: add //
    L_0026: stelem.i // }}}}}
    L_0027: stloc.1 // chPtrArray = the new array that we have just filled.
    L_0028: newobj instance void [mscorlib]System.Collections.Generic.List`1<class [mscorlib]System.Action>::.ctor() //{{{{{
    L_002d: stloc.2 // list = new List<Action>()
    L_002e: nop //}}}}}
    L_002f: ldloc.1 //{{{{{
    L_0030: stloc.3 //chPtrArray2 = chPtrArray}}}}}
    L_0031: ldc.i4.0 //for (int num = 0; num < chPtrArray2.Length; num++)
    L_0032: stloc.s num //
    L_0034: br.s L_0062 //<<<<< (for start)
    L_0036: newobj instance void ConsoleTests.Program/<>c__DisplayClass0_0::.ctor() //{{{{{
    L_003b: stloc.s class_ //class_ = new temporary compile-time class
    L_003d: ldloc.s class_ //}}}}}
    L_003f: ldloc.3 //{{{{{
    L_0040: ldloc.s num //
    L_0042: ldelem.i //
    L_0043: stfld char* ConsoleTests.Program/<>c__DisplayClass0_0::pointer //class_.pointer = chPtrArray2[num]}}}}}
    L_0048: ldloc.2 //{{{{{
    L_0049: ldloc.s class_ //
    L_004b: ldftn instance void ConsoleTests.Program/<>c__DisplayClass0_0::<Main>b__0() // list.Add(class_.<Main>b__0);
    L_0051: newobj instance void [mscorlib]System.Action::.ctor(object, native int) // (Adds the temporary compile-time class action, which has the correct pointer since
    L_0056: callvirt instance void [mscorlib]System.Collections.Generic.List`1<class [mscorlib]System.Action>::Add(!0) //it is a specific class instance for this iteration, to the list)}}}}}
    L_005b: nop 
    L_005c: ldloc.s num //practically the end of the for
    L_005e: ldc.i4.1 // (actually increasing num and comparing)
    L_005f: add //
    L_0060: stloc.s num //
    L_0062: ldloc.s num //
    L_0064: ldloc.3 //
    L_0065: ldlen //
    L_0066: conv.i4 //
    L_0067: blt.s L_0036 //>>>>> (for complete)
    L_0069: ldstr "List complete" //Printing and stuff.....
    L_006e: call void [mscorlib]System.Console::WriteLine(string)
    L_0073: nop 
    L_0074: nop 
    L_0075: ldloc.2 
    L_0076: callvirt instance valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator`0<!0> [mscorlib]System.Collections.Generic.List`1<class [mscorlib]System.Action>::GetEnumerator()
    L_007b: stloc.s enumerator
    L_007d: br.s L_0090
    L_007f: ldloca.s enumerator
    L_0081: call instance !0 [mscorlib]System.Collections.Generic.List`1/Enumerator`0<class [mscorlib]System.Action>::get_Current()
    L_0086: stloc.s action
    L_0088: ldloc.s action
    L_008a: callvirt instance void [mscorlib]System.Action::Invoke()
    L_008f: nop 
    L_0090: ldloca.s enumerator
    L_0092: call instance bool [mscorlib]System.Collections.Generic.List`1/Enumerator`0<class [mscorlib]System.Action>::MoveNext()
    L_0097: brtrue.s L_007f
    L_0099: leave.s L_00aa
    L_009b: ldloca.s enumerator
    L_009d: constrained. [mscorlib]System.Collections.Generic.List`1/Enumerator`0<class [mscorlib]System.Action>
    L_00a3: callvirt instance void [mscorlib]System.IDisposable::Dispose()
    L_00a8: nop 
    L_00a9: endfinally 
    L_00aa: ret 
    .try L_007d to L_009b finally handler L_009b to L_00aa
}

As you can see, a class called <>c__DisplayClass0_0 is generated at compile time. It holds the body of your lambda as a method and the captured char* value as a field. The class looks like this:

[CompilerGenerated]
private sealed class <>c__DisplayClass0_0
{
    // Fields
    public unsafe char* pointer;

    // Methods
    internal unsafe void <Main>b__0()
    {
        Console.WriteLine("Pointer address is {0:X2}.", (long) ((ulong) this.pointer));
    }
}

In the MSIL code we can see that the foreach is compiled to the following for loop:

shallowCloneOfArray = arrayOfPointers; // copies the array reference only; no elements are cloned
for (int num = 0; num < shallowCloneOfArray.Length; num++)
{
    <>c__DisplayClass0_0 temp = new <>c__DisplayClass0_0(); // fresh instance per iteration
    temp.pointer = shallowCloneOfArray[num];
    list.Add(temp.<Main>b__0); // Adds the action to the list of actions
}

This means that the value of the pointer is copied on each iteration, at the moment the delegate is created, so the value of pointer at that time is the one that will be printed. In other words, each action belongs to its own instance of <>c__DisplayClass0_0 and receives its own copy of the pointer.
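The compiler-generated names are not legal C# identifiers, so the lowered loop cannot be compiled as written. The following is a hand-written sketch of the same shape, using the hypothetical names PointerClosure, Format, LoweredLoop and Build, and returning strings instead of printing so the result is easy to inspect. Like the original program, it has to be compiled with unsafe code enabled (the /unsafe compiler option).

```csharp
using System;
using System.Collections.Generic;

// Hand-written stand-in for the compiler-generated <>c__DisplayClass0_0.
// The name and the Format method are illustrative, not compiler output.
sealed unsafe class PointerClosure
{
    public char* pointer;

    public string Format()
    {
        return string.Format("Pointer address is {0:X2}.", (long)pointer);
    }
}

static class LoweredLoop
{
    // Creates one closure object per iteration, mirroring the lowered for loop.
    public static unsafe List<Func<string>> Build()
    {
        char* zero = null;
        char*[] arrayOfPointers = { zero, zero + 1, zero + 2, zero + 100 };

        var list = new List<Func<string>>();
        char*[] copy = arrayOfPointers;        // copies the reference, not the elements
        for (int num = 0; num < copy.Length; num++)
        {
            var temp = new PointerClosure();   // fresh instance each iteration
            temp.pointer = copy[num];          // the pointer value is copied here
            list.Add(temp.Format);
        }
        return list;
    }

    static void Main()
    {
        foreach (var f in Build())
            Console.WriteLine(f());
    }
}
```

Because temp is created inside the loop body, each delegate is bound to a different object, and invoking the delegates later cannot observe a value written by a later iteration.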

As we just saw, the only variable reused across iterations is the local that holds the array reference; the captured pointers themselves are not reused. So if the specification reads as you quote it, it is wrong: the expansion you attached implies that all four delegates share a single variable and would print C8 four times. The actual result is:

List complete
Pointer address is 00.
Pointer address is 02.
Pointer address is 04.
Pointer address is C8.

Tamir Vered answered Oct 05 '22 22:10