Warning: This is merely an exercise for those who are passionate about breaking stuff to understand its mechanics.
I was exploring the limits of what I could accomplish in C# and I wrote a ForceCast()
function to perform a brute-force cast without any type checks. Never consider using this function in production code.
I wrote a class called Original
and a struct called LikeOriginal
, both with two integer variables. In Main()
I created a new variable called orig
and set it to a new instance of Original
with a=7
and b=20
. When orig
is cast into LikeOriginal
and stored in casted
, the values of cG
and dG
become undefined, which is to be expected as LikeOriginal
is a struct, and class instances carry more metadata than struct instances, which causes a memory layout mismatch.
Example Output:
Casted Original to LikeOriginal
1300246376, 542
1300246376, 542
added 3
Casted LikeOriginal back to Original
1300246379, 545
Notice, however, that when I call casted.Add(3)
and cast back to Original
and print the values of a
and b
, surprisingly they are successfully incremented by 3, and this has been repeatable.
What is confusing me is the fact that casting the class to the struct will cause cG
and dG
to map to class metadata, but when they are modified and cast back to a class, they map correctly with a
and b
.
Why is this the case?
The code used:
using System;
using System.Runtime.InteropServices;
namespace BreakingStuff {
public class Original {
public int a, b;
public Original(int a, int b)
{
this.a = a;
this.b = b;
}
public void Add(int val)
{
}
}
public struct LikeOriginal {
public int cG, dG;
public override string ToString() {
return cG + ", " + dG;
}
public void Add(int val) {
cG += val;
dG += val;
}
}
public static class Program {
public unsafe static void Main() {
Original orig = new Original(7, 20);
LikeOriginal casted = ForceCast<Original, LikeOriginal>(orig);
Console.WriteLine("Casted Original to LikeOriginal");
Console.WriteLine(casted.cG + ", " + casted.dG);
Console.WriteLine(casted.ToString());
casted.Add(3);
Console.WriteLine("added 3");
orig = ForceCast<LikeOriginal, Original>(casted);
Console.WriteLine("Casted LikeOriginal back to Original");
Console.WriteLine(orig.a + ", " + orig.b);
Console.ReadLine();
}
//performs a pointer cast but with the same memory layout.
private static unsafe TOut ForceCast<TIn, TOut>(this TIn input) {
GCHandle handle = GCHandle.Alloc(input);
TOut result = Read<TOut>(GCHandle.ToIntPtr(handle));
handle.Free();
return result;
}
private static unsafe T Read<T>(this IntPtr address) {
T obj = default(T);
if (address == IntPtr.Zero)
return obj;
TypedReference tr = __makeref(obj);
*(IntPtr*) (&tr) = address;
return __refvalue(tr, T);
}
}
}
Edit: Long story short: first create a ForceCast function that correctly handles both identity translations, ForceCast<LikeOriginal, LikeOriginal>
and ForceCast<Original, Original>
, then you might have a chance of getting actual conversions working.
By providing separate code for class->class (CC), class->struct (CS), struct->class (SC) and struct->struct (SS), using Nullable<T>
as an intermediate for structs, I got a working example:
// class -> class
private static unsafe TOut ForceCastCC<TIn, TOut>(TIn input)
where TIn : class
where TOut : class
{
var handle = __makeref(input);
return Read<TOut>(*(IntPtr*)(&handle));
}
// struct -> struct, require nullable types for in-out
private static unsafe TOut? ForceCastSS<TIn, TOut>(TIn? input)
where TIn : struct
where TOut : struct
{
var handle = __makeref(input);
return Read<TOut?>(*(IntPtr*)(&handle));
}
// class -> struct
private static unsafe TOut? ForceCastCS<TIn, TOut>(TIn input)
where TIn : class
where TOut : struct
{
var handle = __makeref(input);
// one extra de-reference of the input pointer
return Read<TOut?>(*(IntPtr*)*(IntPtr*)(&handle));
}
// struct -> class
private static unsafe TOut ForceCastSC<TIn, TOut>(TIn? input)
where TIn : struct
where TOut : class
{
// get a real pointer to the struct, so it can be turned into a reference type
var handle = GCHandle.Alloc(input);
var result = Read<TOut>(GCHandle.ToIntPtr(handle));
handle.Free();
return result;
}
Now use the appropriate function in your sample and handle the nullable types as the compiler demands:
Original orig = new Original(7, 20);
LikeOriginal casted = ForceCastCS<Original, LikeOriginal>(orig) ?? default(LikeOriginal);
Console.WriteLine("Casted Original to LikeOriginal");
Console.WriteLine(casted.cG + ", " + casted.dG);
Console.WriteLine(casted.ToString());
casted.Add(3);
Console.WriteLine("added 3");
orig = ForceCastSC<LikeOriginal, Original>(casted);
Console.WriteLine("Casted LikeOriginal back to Original");
Console.WriteLine(orig.a + ", " + orig.b);
Console.ReadLine();
For me, this returns the correct numbers at each point.
Some details:
Basically, your problem is you treat a value type like a reference type...
Let's first look at the working case: LikeOriginal
-> Original
:
var h1 = GCHandle.Alloc(likeOriginal);
var ptr1 = GCHandle.ToIntPtr(h1);
This creates a pointer that points to the memory area of LikeOriginal
(Edit: actually, not exactly that memory area; see below)
var obj1 = default(Original);
TypedReference t1 = __makeref(obj1);
*(IntPtr*)(&t1) = ptr1;
This creates a reference (pointer) to Original
whose value is the pointer to LikeOriginal.
var original = __refvalue( t1,Original);
This turns the typed reference into a managed reference, pointing to the memory of LikeOriginal
. All values of the starting likeOriginal
object are retained.
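One way to read this working case: GCHandle.Alloc takes an object parameter, so the struct is boxed first, and a boxed struct is a heap object with its own header, much like a class instance. The sketch below reproduces only that effect, swapping the GCHandle/__makeref trick for Unsafe.As<T>(object) from System.Runtime.CompilerServices (built into .NET Core 3.0 and later, a separate package on .NET Framework). The class name BoxedReinterpretSketch is hypothetical, and the result relies on the assumption that both types place their two ints at the same offsets, which the runtime does not guarantee.

```csharp
using System;
using System.Runtime.CompilerServices;

public class Original
{
    public int a, b;
}

public struct LikeOriginal
{
    public int cG, dG;
}

// Hypothetical name, for illustration only.
public static class BoxedReinterpretSketch
{
    public static void Main()
    {
        var likeOriginal = new LikeOriginal { cG = 10, dG = 23 };

        // Boxing copies the struct into a new heap object that has an object header.
        object boxed = likeOriginal;

        // Unchecked reinterpretation of the reference: no type check is performed.
        // Assumption: the fields of the boxed struct sit where Original expects a and b.
        Original original = Unsafe.As<Original>(boxed);

        // Reads the boxed copy's memory through the Original field offsets.
        Console.WriteLine(original.a + ", " + original.b);
    }
}
```

Like ForceCast itself, this is undefined behavior as far as the runtime is concerned and is only meant to illustrate why the struct -> class direction appears to work.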
Now let's analyze an intermediate case that should work if your code worked in both directions: LikeOriginal
-> LikeOriginal
:
var h2 = GCHandle.Alloc(likeOriginal);
var ptr2 = GCHandle.ToIntPtr(h2);
Again, we have a pointer that points to the memory area of LikeOriginal
var obj2 = default(LikeOriginal);
TypedReference t2 = __makeref(obj2);
Now here is the first hint of what is going wrong: __makeref(obj2)
will create a reference to the LikeOriginal
object, not to some separate area where the pointer is stored.
*(IntPtr*)(&t2) = ptr2;
ptr2
, however, is a pointer to some reference value.
var likeOriginal2 = __refvalue( t2,LikeOriginal);
Here we get garbage, because t2
is supposed to be a direct reference to the object memory rather than a reference to some pointer memory.
The following is some test code I ran to get a better understanding of your approach and what goes wrong (some of it fairly structured, followed by some parts where I tried additional things):
Original o1 = new Original(111, 222);
LikeOriginal o2 = new LikeOriginal { cG = 333, dG = 444 };
// get handles to the objects themselves and to their individual properties
GCHandle h1 = GCHandle.Alloc(o1);
GCHandle h2 = GCHandle.Alloc(o1.a);
GCHandle h3 = GCHandle.Alloc(o1.b);
GCHandle h4 = GCHandle.Alloc(o2);
GCHandle h5 = GCHandle.Alloc(o2.cG);
GCHandle h6 = GCHandle.Alloc(o2.dG);
// get pointers from the handles, each pointer has an individual value
IntPtr i1 = GCHandle.ToIntPtr(h1);
IntPtr i2 = GCHandle.ToIntPtr(h2);
IntPtr i3 = GCHandle.ToIntPtr(h3);
IntPtr i4 = GCHandle.ToIntPtr(h4);
IntPtr i5 = GCHandle.ToIntPtr(h5);
IntPtr i6 = GCHandle.ToIntPtr(h6);
// get typed references for the objects and properties
TypedReference t1 = __makeref(o1);
TypedReference t2 = __makeref(o1.a);
TypedReference t3 = __makeref(o1.b);
TypedReference t4 = __makeref(o2);
TypedReference t5 = __makeref(o2.cG);
TypedReference t6 = __makeref(o2.dG);
// get the associated pointers
IntPtr j1 = *(IntPtr*)(&t1);
IntPtr j2 = *(IntPtr*)(&t2); // j1 != j2, because a class handle points to the pointer/reference memory
IntPtr j3 = *(IntPtr*)(&t3);
IntPtr j4 = *(IntPtr*)(&t4);
IntPtr j5 = *(IntPtr*)(&t5); // j4 == j5, because a struct handle points directly to the instance memory
IntPtr j6 = *(IntPtr*)(&t6);
// direct translate-back is working for all objects and properties
var r1 = __refvalue( t1,Original);
var r2 = __refvalue( t2,int);
var r3 = __refvalue( t3,int);
var r4 = __refvalue( t4,LikeOriginal);
var r5 = __refvalue( t5,int);
var r6 = __refvalue( t6,int);
// assigning the pointers that were obtained from the GCHandles
*(IntPtr*)(&t1) = i1;
*(IntPtr*)(&t2) = i2;
*(IntPtr*)(&t3) = i3;
*(IntPtr*)(&t4) = i4;
*(IntPtr*)(&t5) = i5;
*(IntPtr*)(&t6) = i6;
// translate back the changed references
var s1 = __refvalue( t1,Original); // Ok
// rest is garbage values!
var s2 = __refvalue( t2,int);
var s3 = __refvalue( t3,int);
var s4 = __refvalue( t4,LikeOriginal);
var s5 = __refvalue( t5,int);
var s6 = __refvalue( t6,int);
// a variation, primitively dereferencing the pointer to get to the actual memory
*(IntPtr*)(&t4) = *(IntPtr*)i4;
var s4_1 = __refvalue( t4,LikeOriginal); // partial result, getting { garbage, 333 } instead of { 333, 444 }
// prepare TypedReference for translation between Original and LikeOriginal
var obj1 = default(Original);
var obj2 = default(LikeOriginal);
TypedReference t7 = __makeref(obj1);
TypedReference t8 = __makeref(obj2);
// translate between Original and LikeOriginal
*(IntPtr*)(&t7) = i4; // From struct to class, the pointer acquired through GCHandle is appropriate
var s7 = __refvalue( t7,Original); // Ok
*(IntPtr*)(&t8) = *(IntPtr*)j1;
var s8 = __refvalue( t8,LikeOriginal); // Not Ok - Original has some value coming before its first member - getting { garbage, 111 } instead of { 111, 222 }
*(IntPtr*)(&t8) = j2;
var s9 = __refvalue( t8,LikeOriginal); // Ok by starting at the address of the first member
Conclusion: Going via GCHandle
-> IntPtr
creates a pointer to a memory location just in front of the first member, regardless of whether the starting point is a struct or a class. As a result, struct -> class and class -> class work, but class -> struct and struct -> struct do not.
The only way I found for targeting structs is to get a pointer to their first member (which in case of an input struct equals the __makeref
to the struct without going via GCHandle
).
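The first-member idea can also be sketched without __makeref. The version below is a swapped-in technique, not the code above: it uses Unsafe.As<TFrom, TTo>(ref TFrom) from System.Runtime.CompilerServices to reinterpret a reference to the first field of the class as a reference to the struct. The class name FirstMemberSketch is hypothetical, and the sketch assumes the runtime lays out a and b adjacently in declaration order, which is not guaranteed for classes.

```csharp
using System;
using System.Runtime.CompilerServices;

public class Original
{
    public int a, b;
    public Original(int a, int b) { this.a = a; this.b = b; }
}

public struct LikeOriginal
{
    public int cG, dG;
    public override string ToString() { return cG + ", " + dG; }
}

// Hypothetical name, for illustration only.
public static class FirstMemberSketch
{
    public static void Main()
    {
        var orig = new Original(7, 20);

        // Treat the address of the first field as the start of a LikeOriginal
        // and copy the struct out of the object.
        // Assumption: b directly follows a in the object's memory.
        LikeOriginal copy = Unsafe.As<int, LikeOriginal>(ref orig.a);

        // The copy is independent of orig from here on, like 'casted' in the question.
        Console.WriteLine(copy);
    }
}
```

Starting at the first field skips whatever the runtime stores in front of it, which is the same reason the j2 variant in the test code succeeds where the j1 variant does not.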
Here is how I see this situation. You have acted upon the reference to Original
as if it were a reference to LikeOriginal
. Critical point here is that you are invoking LikeOriginal.Add()
method, the address of which is resolved statically during compile time.
This method, in turn, operates on a this
reference which it implicitly receives. Therefore, it modifies values which are offset by 0 and by 4 bytes relative to this
reference it has in its hands.
Since this experiment worked out, it indicates that the layouts of Original
object and LikeOriginal
struct are the same. I know that structs have a flat layout, which makes them useful when allocating arrays of structs: nothing is inserted into the sequence of bytes representing the flat content of the structs. That is precisely what does not hold for classes, which need one reference that is used to resolve virtual functions and the type at run time.
Which reminds me to say that the lack of this added reference is the core reason why structs do not support derivation: you wouldn't know whether you have a base or derived struct in a later call.
Anyway, back to the surprising fact that this code worked fine. I have worked with C++ compilers, and I remember that they used to put the v-table pointer before the actual data content of the object. In other words, this
pointer used to point 4 bytes after the actual address of the memory block allocated for that object. Maybe C# is doing the same, in which case this
reference in a method invoked on Original
points to a
, just like the this
reference in a method invoked on LikeOriginal
points to cG
.
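The offsets mentioned here can be checked directly, at least as a rough experiment. The sketch below pins the two fields of an Original instance, prints the byte distance between them, and then invokes LikeOriginal.Add through a pointer to the first field, so the struct method's this is the address of a. It needs to be compiled with unsafe code enabled, the class name ThisOffsetSketch is hypothetical, and the field layout it relies on is an assumption about the runtime rather than something the language promises.

```csharp
using System;

public class Original
{
    public int a, b;
    public Original(int a, int b) { this.a = a; this.b = b; }
}

public struct LikeOriginal
{
    public int cG, dG;
    public void Add(int val) { cG += val; dG += val; }
}

// Hypothetical name, for illustration only.
public static class ThisOffsetSketch
{
    public static unsafe void Main()
    {
        var orig = new Original(7, 20);

        // Pin the object so the GC cannot move it while raw pointers are in use.
        fixed (int* pa = &orig.a)
        fixed (int* pb = &orig.b)
        {
            // Distance in bytes between the two fields inside the object.
            Console.WriteLine("byte distance a -> b: " + ((byte*)pb - (byte*)pa));

            // Call the struct method with 'this' pointing at the first field.
            // Assumption: b sits where the struct expects dG.
            ((LikeOriginal*)pa)->Add(3);
        }

        // If the layouts line up, the additions land in orig's own fields.
        Console.WriteLine(orig.a + ", " + orig.b);
    }
}
```

Unlike the ForceCast round trip in the question, nothing is copied here: the struct method writes into the class instance itself.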