Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How are the "primitive" types defined non-recursively?

Tags:

Since a struct in C# consists of the bits of its members, you cannot have a value type T which includes any T fields:

// Struct member 'T.m_field' of type 'T' causes a cycle in the struct layout
struct T { T m_field; }

My understanding is that an instance of the above type could never be instantiated*—any attempt to do so would result in an infinite loop of instantiation/allocation (which I guess would cause a stack overflow?**)—or, alternately, another way of looking at it might be that the definition itself just doesn't make sense; perhaps it's a self-defeating entity, sort of like "This statement is false."

Curiously, though, if you run this code:

BindingFlags privateInstance = BindingFlags.NonPublic | BindingFlags.Instance;

// Give me all the private instance fields of the int type.
FieldInfo[] int32Fields = typeof(int).GetFields(privateInstance);

foreach (FieldInfo field in int32Fields)
{
    Console.WriteLine("{0} ({1})", field.Name, field.FieldType);
}

...you will get the following output:

m_value (System.Int32)

It seems we are being "lied" to here***. Obviously I understand that the primitive types like int, double, etc. must be defined in some special way deep down in the bowels of C# (you cannot define every possible unit within a system in terms of that system... can you?—different topic, regardless!); I'm just interested to know what's going on here.

How does the System.Int32 type (for example) actually account for the storage of a 32-bit integer? More generally, how can a value type (as a definition of a kind of value) include a field whose type is itself? It just seems like turtles all the way down.

Black magic?


*On a separate note: is this the right word for a value type ("instantiated")? I feel like it carries "reference-like" connotations; but maybe that's just me. Also, I feel like I may have asked this question before—if so, I forget what people answered.

**Both Martin v. Löwis and Eric Lippert have pointed out that this is neither entirely accurate nor an appropriate perspective on the issue. See their answers for more info.

***OK, I realize nobody's actually lying. I didn't mean to imply that I thought this was false; my suspicion had been that it was somehow an oversimplification. After coming to understand (I think) thecoop's answer, it makes a lot more sense to me.

like image 737
Dan Tao Avatar asked Jan 20 '11 19:01

Dan Tao


2 Answers

As far as I know, within a field signature that is stored in an assembly, there are certain hardcoded byte patterns representing the 'core' primitive types - the signed/unsigned integers, and floats (as well as strings, which are reference types and a special case). The CLR knows natively how to deal with those. Check out Partition II, section 23.2.12 of the CLR spec for the bit patterns of the signatures.

Within each primitive struct ([mscorlib]System.Int32, [mscorlib]System.Single etc) in the BCL is a single field of that native type, and because a struct is exactly the same size as its constituent fields, each primitive struct is the same bit pattern as its native type in memory, and so can be interpreted as either, by the CLR, C# compiler, or libraries using those types.

From C#, int, double etc are synonyms of the mscorlib structs, which each have their primitive field of a type that is natively recognised by the CLR.

(There's an extra complication here, in that the CLR spec specifies that any types that have a 'short form' (the native CLR types) always have to be encoded as that short form (int32), rather than valuetype [mscorlib]System.Int32. So the C# compiler knows about the primitive types as well, but I'm not sure of the exact semantics and special-casing that goes on in the C# compiler and CLR for, say, method calls on primitive structs)

So, due to Godel's Incompleteness Theorem, there has to be something 'outside' the system by which it can be defined. This is the Magic that lets the CLR interpret 4 bytes as a native int32 or an instance of [mscorlib]System.Int32, which is aliased from C#.

like image 134
thecoop Avatar answered Sep 24 '22 22:09

thecoop


My understanding is that an instance of the above type could never be instantiated any attempt to do so would result in an infinite loop of instantiation/allocation (which I guess would cause a stack overflow?)—or, alternately, another way of looking at it might be that the definition itself just doesn't make sense;

That's not the best way of characterizing the situation. A better way to look at it is that the size of every struct must be well-defined. An attempt to determine the size of T goes into an infinite loop, and therefore the size of T is not well-defined. Therefore, it's not a legal struct because every struct must have a well-defined size.

It seems we are being lied to here

There's no lie. An int is a struct that contains a field of type int. An int is of known size; it is by definition four bytes. Therefore it is a legal struct, because the size of all its fields is known.

How does the System.Int32 type (for example) actually store a 32-bit integer value

The type doesn't do anything. The type is just an abstract concept. The thing that does the storage is the CLR, and it does so by allocating four bytes of space on the heap, on the stack, or in registers. How else do you suppose a four-byte integer would be stored, if not in four bytes of memory?

how does the System.Type object referenced with typeof(int) present itself as though this value is itself an everyday instance field typed as System.Int32?

That's just an object, written in code like any other object. There's nothing special about it. You call methods on it, it returns more objects, just like every other object in the world. Why do you think there's something special about it?

like image 29
Eric Lippert Avatar answered Sep 24 '22 22:09

Eric Lippert