
Can another thread see a partially created collection when using a collection initializer?

Imagine this C# code in some method:

SomeClass.SomeGlobalStaticDictionary = new Dictionary<int, string>()
{
    {0, "value"},
};

Let's say no one is using any explicit memory barriers or locking to access the dictionary.

If no optimization takes place, then the global dictionary should be either null (initial value) or a properly constructed dictionary with one entry.

The question is: can the effects of the Add call and the assignment to SomeGlobalStaticDictionary be reordered so that another thread sees an empty, non-null SomeGlobalStaticDictionary (or some other invalid, partially constructed dictionary)?

Does the answer change if SomeGlobalStaticDictionary is volatile?

After reading http://msdn.microsoft.com/en-us/magazine/jj863136.aspx (and its second part), I learned that, in theory, an assignment made in source code may be observed differently by other threads, for many reasons. I looked at the IL code, but the question is whether the JIT compiler and/or the CPU are allowed to not "flush" the effects of the Add call to other threads before the assignment to SomeGlobalStaticDictionary.

Palo asked Apr 22 '13 16:04



2 Answers

For local variables, with optimization turned on, the compiler will (at least sometimes) emit code that first assigns to the variable and then calls Add (or sets properties, for object initializers).

If you use a static or an instance variable, you'll see different behaviour:

class Test
{
    static List<int> StaticList = new List<int> { 1 };
    List<int> InstanceList = new List<int> { 2 };
}

Gives the following type initializer IL:

.method private hidebysig specialname rtspecialname static 
        void  .cctor() cil managed
{
  // Code size       21 (0x15)
  .maxstack  2
  .locals init (class [mscorlib]System.Collections.Generic.List`1<int32> V_0)
  IL_0000:  newobj     instance void class [mscorlib]System.Collections.Generic.List`1<int32>::.ctor()
  IL_0005:  stloc.0
  IL_0006:  ldloc.0
  IL_0007:  ldc.i4.1
  IL_0008:  callvirt   instance void class [mscorlib]System.Collections.Generic.List`1<int32>::Add(!0)
  IL_000d:  nop
  IL_000e:  ldloc.0
  IL_000f:  stsfld     class [mscorlib]System.Collections.Generic.List`1<int32> Test::StaticList
  IL_0014:  ret
} // end of method Test::.cctor

And the following constructor IL:

.method public hidebysig specialname rtspecialname 
        instance void  .ctor() cil managed
{
  // Code size       29 (0x1d)
  .maxstack  3
  .locals init (class [mscorlib]System.Collections.Generic.List`1<int32> V_0)
  IL_0000:  ldarg.0
  IL_0001:  newobj     instance void class [mscorlib]System.Collections.Generic.List`1<int32>::.ctor()
  IL_0006:  stloc.0
  IL_0007:  ldloc.0
  IL_0008:  ldc.i4.2
  IL_0009:  callvirt   instance void class [mscorlib]System.Collections.Generic.List`1<int32>::Add(!0)
  IL_000e:  nop
  IL_000f:  ldloc.0
  IL_0010:  stfld      class [mscorlib]System.Collections.Generic.List`1<int32> Test::InstanceList
  IL_0015:  ldarg.0
  IL_0016:  call       instance void [mscorlib]System.Object::.ctor()
  IL_001b:  nop
  IL_001c:  ret
} // end of method Test::.ctor

In both cases, the collection is populated before the field is set. That's not to say there can't still be memory model issues, but it's not the same as the field being set to refer to an empty collection and then the Add call being made. From the perspective of the assigning thread, the assignment happens after the Add.

In general, both object initializer and collection initializer expressions are equivalent to constructing the object using a temporary variable - so in the case where you use it in an assignment, the property setters are all called before the assignment takes place.

However, I don't believe any special guarantees are given around visibility to other threads for object/collection initializers. I would suggest that you imagine what the code would look like if written out "long-hand" according to the specification, and then reason from there.
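As a rough illustration of that "long-hand" form, here is a sketch of how the assignment from the question expands. The local name temp and the containing class LongHandExample are hypothetical, and SomeClass is declared here only so the snippet is self-contained; the compiler's actual temporary has no name you can refer to.

```csharp
using System.Collections.Generic;

static class SomeClass
{
    // Stand-in declaration for the field used in the question.
    public static Dictionary<int, string> SomeGlobalStaticDictionary;
}

static class LongHandExample
{
    public static void Assign()
    {
        // Step 1: construct the collection into a temporary (hypothetical name).
        var temp = new Dictionary<int, string>();

        // Step 2: each element initializer becomes an Add call on the temporary.
        temp.Add(0, "value");

        // Step 3: only now is the reference stored in the field.
        SomeClass.SomeGlobalStaticDictionary = temp;
    }
}
```

This shows only the order of operations in program order on the assigning thread. It says nothing about the order in which another thread, reading the field without synchronization, observes those writes.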

There are guarantees given for static initializers and constructors - but primarily within the Microsoft implementation of .NET rather than "general" guarantees (e.g. within the C# specification or the ECMA spec).

Jon Skeet answered Sep 22 '22 00:09



Let me start by saying that I do not know the answer to your question, but I can help you simplify it down to its essence:

unsafe class C
{
    static int x;  // Assumed to be initialized to zero
    static int *p; // Assumed to be initialized to null
    static void M()
    {
        int* t = &C.x;
        *t = 1;
        C.p = t;
    }
    ...

Here the int x is standing in for the dictionary, p is standing in for your field that references the dictionary, t is the temporary created, and adding an element to the dictionary is modeled as mutating the value of field x. So the sequence of events here is: obtain storage for the dictionary and save that in a temporary, then mutate the thing referred to, and then publish the result.

The question is whether under the C# memory model, an observer on another thread is permitted to see that C.p is pointing to x and that x is still zero.

Like I said, I do not know for certain the answer to that; I would be interested to find out.

Off the top of my head though: why should that not be possible? p and x can be on completely different pages of memory. Suppose on some processor the value of x has been pre-fetched but p has not. Could that processor observe that p is not null but x is still zero? What's stopping that?
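This does not answer whether the unsynchronized version can fail, but one way to avoid depending on the answer is to publish and read the reference explicitly with Volatile.Write and Volatile.Read (System.Threading, .NET 4.5 and later). The following is a minimal sketch under stated assumptions: the class PublishedDictionary, the field _current, and the methods Publish and TryGet are hypothetical names, and the dictionary is assumed never to be mutated after it is published.

```csharp
using System.Collections.Generic;
using System.Threading;

static class PublishedDictionary
{
    // Hypothetical field standing in for SomeGlobalStaticDictionary.
    private static Dictionary<int, string> _current;

    public static void Publish()
    {
        var temp = new Dictionary<int, string>();
        temp.Add(0, "value");

        // Reads and writes that appear before this call (including those
        // made inside Add) cannot be moved after the store to _current.
        Volatile.Write(ref _current, temp);
    }

    public static bool TryGet(int key, out string value)
    {
        // Reads that appear after this call cannot be moved before it.
        var snapshot = Volatile.Read(ref _current);
        if (snapshot == null)
        {
            value = null;
            return false;
        }
        return snapshot.TryGetValue(key, out value);
    }
}
```

A reader calling TryGet sees either no dictionary at all or one whose Add has already taken effect. Declaring the field volatile, or guarding both sides with a lock, would be alternative ways to get a comparable ordering.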

Eric Lippert answered Sep 19 '22 00:09
