
How does mscorlib's `System.Boolean` avoid struct layout cycles?

Tags:

c#

The source code for System.Boolean on the Reference Source website shows that the struct Boolean contains only a single instance field, private bool m_value:

https://referencesource.microsoft.com/#mscorlib/system/boolean.cs,f1b135ff6c380b37

namespace System {

    using System;
    using System.Globalization;
    using System.Diagnostics.Contracts;

    [Serializable]
    [System.Runtime.InteropServices.ComVisible(true)]
    public struct Boolean : IComparable, IConvertible
#if GENERICS_WORK
        , IComparable<Boolean>,  IEquatable<Boolean>
#endif
    {
      private bool m_value;

      internal const int True = 1; 
      internal const int False = 0; 

      internal const String TrueLiteral  = "True";
      internal const String FalseLiteral = "False";

      public static readonly String TrueString  = TrueLiteral;
      public static readonly String FalseString = FalseLiteral;
      // ... remaining members omitted ...
    }
}

But I noticed that...

  • bool is a C# language alias for System.Boolean.
  • The type is declared as struct Boolean, a value type, so it should not be able to contain an instance field of its own type.
  • ...yet this code presumably compiles.
  • I understand that when the -nostdlib compiler option is set you have to provide your own definitions of essential types such as System.String, System.Int32 and System.Exception, but as far as I know that is the only difference.
  • The published source-code contains no other special attributes like [MethodImpl( MethodImplOptions.InternalCall )].

So how does this code compile?
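For comparison, here is a minimal sketch of what normally happens with an ordinary user-defined struct. The names Wrapper and LayoutCycleDemo are hypothetical and do not come from mscorlib. A struct that declares an instance field of its own type is expected to be rejected with compiler error CS0523, whereas a bool field is accepted, even though bool and System.Boolean are the same type:

```csharp
using System;

public struct Wrapper
{
    // Uncommenting the next line is expected to fail to compile with
    // error CS0523 (struct member causes a cycle in the struct layout).
    // private Wrapper m_self;

    // A bool field is accepted in any user-defined struct.
    private bool m_value;

    public Wrapper(bool value) { m_value = value; }

    public bool Value { get { return m_value; } }
}

public static class LayoutCycleDemo
{
    // bool is the C# keyword alias for System.Boolean, so both typeof
    // expressions yield the same Type object.
    public static bool AliasIsSameType()
    {
        return typeof(bool) == typeof(System.Boolean);
    }
}
```

Calling LayoutCycleDemo.AliasIsSameType() returns true, which is what makes the field declaration in boolean.cs look self-referential.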

asked Dec 04 '19 02:12 by Dai



1 Answer

Short answer: It's a special case, relating to the boxing of value types and their underlying representation. These types are well known to the compiler, and as such they are treated slightly differently from regular types by core parts of the runtime and by the compiler/JIT optimizer.


Since this is buried deep in the runtime implementation, I would presume the language specification does not go into these details. I'm not sure whether this is a fully satisfactory answer, but I think that in this particular case the bool field remains unboxed and thus exists as a raw value inside the structure.

The semantics of boxing and unboxing value types are intentionally opaque to make the language easier to use. In this case the Boolean structure itself seems to rely on implementation-specific boxing rules to implement its actual semantics, such as:

  // Determines whether two Boolean objects are equal.
  public override bool Equals (Object obj) {
    //If it's not a boolean, we're definitely not equal
    if (!(obj is Boolean)) {
      return false;
    }

    return (m_value==((Boolean)obj).m_value);
  }

I believe that in the code above, the boxed object is first type-checked, then unboxed, and the internal bool values are compared directly. Unlike a boxed value, which may be a tagged pointer or an actual structure carrying some runtime type information, an unboxed value is treated as raw data.
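The type check in the Equals method above can be observed from ordinary C# code. This is a minimal sketch (the class name BoxedEqualsDemo is hypothetical): passing an argument typed as object selects Boolean.Equals(Object), which returns false for anything that is not a boxed Boolean.

```csharp
using System;

public static class BoxedEqualsDemo
{
    // The argument is a boxed Boolean, so the "obj is Boolean" check
    // passes and the underlying values are compared.
    public static bool EqualsBoxedTrue()
    {
        object boxed = true;        // boxes the bool
        return true.Equals(boxed);  // resolves to Boolean.Equals(Object)
    }

    // The argument is a boxed Int32, so the type check fails and the
    // method returns false without comparing any value.
    public static bool EqualsBoxedInt()
    {
        object boxed = 1;           // boxes an int, not a bool
        return true.Equals(boxed);
    }
}
```

EqualsBoxedTrue() returns true and EqualsBoxedInt() returns false, even though the boxed int holds the same numeric value that represents true.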

I believe that internally, if a bool has to be boxed in order to be passed as System.Object (because of type erasure, or where no optimization is possible), you would end up with something along the lines of the following for true, which boxes the value 1:

ldc.i4.1
box        [mscorlib]System.Boolean
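In C# source, IL of this shape is typically what you get when a bool is assigned to a variable of type object, although the exact output depends on the compiler version and optimization settings. The following is a rough sketch (the class name BoxingRoundTripDemo is hypothetical) showing that the box records the runtime type and that a cast copies the raw value back out:

```csharp
using System;

public static class BoxingRoundTripDemo
{
    // Assigning a bool to object boxes it; the boxed object reports
    // System.Boolean as its runtime type.
    public static Type BoxedType()
    {
        object boxed = true;
        return boxed.GetType();
    }

    // Casting back to bool unboxes, copying the raw value out of the box.
    public static bool RoundTrip(bool value)
    {
        object boxed = value;
        return (bool)boxed;
    }
}
```

BoxedType() returns typeof(System.Boolean), and RoundTrip returns the value it was given.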

So while at a high level bool and System.Boolean appear to be identical and may be optimized similarly, in this particular case within the runtime the distinction between the boxed and unboxed versions of bool is directly exposed. Similarly, an unboxed bool cannot be compared to a System.Object, which is inherently a boxed type. This answer regarding the need for boxing/unboxing goes into a lot more depth in explaining the principle itself.

In managed languages, runtime implementations generally need to be exempt from certain rules when it comes to some core runtime features; this is certainly true for Java and other JVM-based languages. While I'm not as familiar with the CLR, I would think the same principle applies here.

While this question about 'bool' being a type alias for 'System.Boolean' essentially covers general use cases, when getting close to the runtime implementation, the dialect of C# becomes more like "implementation specific C#", which can bend the rules slightly.

answered Oct 16 '22 10:10 by Kristina Brooks