Consider the following XE6 code. The intention is that ThingData should be written to the console for both Thing1 and Thing2, but it is not. Why is that?
program BytesFiddle;

{$APPTYPE CONSOLE}
{$R *.res}

uses
  System.SysUtils;

type
  TThing = class
  private
    FBuf : TBytes;
    FData : TBytes;
    function GetThingData: TBytes;
    function GetThingType: Byte;
  public
    property ThingType : Byte read GetThingType;
    property ThingData : TBytes read GetThingData;
    constructor CreateThing(const AThingType : Byte; const AThingData: TBytes);
  end;

{ TThing1 }

constructor TThing.CreateThing(const AThingType : Byte; const AThingData: TBytes);
begin
  SetLength(FBuf, Length(AThingData) + 1);
  FBuf[0] := AThingType;
  Move(AThingData[0], FBuf[1], Length(AThingData));
  FData := @FBuf[1];
  SetLength(FData, Length(FBuf) - 1);
end;

function TThing.GetThingData: TBytes;
begin
  Result := FData;
end;

function TThing.GetThingType: Byte;
begin
  Result := FBuf[0];
end;

var
  Thing1, Thing2 : TThing;

begin
  try
    Thing1 := TThing.CreateThing(0, TEncoding.UTF8.GetBytes('Sneetch'));
    Thing2 := TThing.CreateThing(1, TEncoding.UTF8.GetBytes('Star Belly Sneetch'));
    Writeln(TEncoding.UTF8.GetString(Thing2.ThingData));
    Writeln(Format('Type %d', [Thing2.ThingType]));
    Writeln(TEncoding.UTF8.GetString(Thing1.ThingData));
    Writeln(Format('Type %d', [Thing1.ThingType]));
    ReadLn;
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
end.
Let me walk you through the ways in which this code fails and how the compiler allows you to shoot yourself in the foot.
If you step through the code using the debugger you can see what happens.
After the initialization of Thing1 you can see that FData is filled with all zeros.
Strangely enough, Thing2 is fine.
Therefore the error is in CreateThing. Let's investigate further...
In the oddly named constructor CreateThing you have the following line:
FData := @FBuf[1];
This looks like a simple assignment, but it is really a call to DynArrayAsg:
Project97.dpr.32: FData := @FBuf[1];
0042373A 8B45FC mov eax,[ebp-$04]
0042373D 83C008 add eax,$08
00423743 8B5204 mov edx,[edx+$04]
00423746 42 inc edx
00423747 8B0DE03C4000 mov ecx,[$00403ce0]
0042374D E8E66DFEFF call @DynArrayAsg <<-- lots of stuff happening here.
One of the checks DynArrayAsg performs is whether the source dynamic array is empty. DynArrayAsg also does a few other things that you need to be aware of.
Let's first have a look at the structure of a dynamic array; it's not just a simple pointer to an array!
Offset 32/64 | Contents
--------------+--------------------------------------------------------------
-8/-12 | 32 bit reference count
-4/-8 | 32 or 64 bit length indicator
0/ 0 | data of the array.
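To see these prefix fields for yourself, here is a minimal sketch that reads them from a live TBytes. It assumes the layout in the table above (length directly in front of the data, 32-bit reference count directly in front of the length); that layout is an implementation detail of the RTL, so treat this as a debugging aid only. The program name and variable names are made up for the illustration.

```pascal
program DynArrayHeaderPeek;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

var
  Bytes: TBytes;
  Data: PByte;
  HeaderLen: NativeInt;
  HeaderRefCount: Integer;
begin
  Bytes := TEncoding.UTF8.GetBytes('Sneetch');

  // A dynamic array variable is a pointer to element 0; the header sits below it.
  Data := Pointer(Bytes);

  // Assumed layout: the length is stored in the NativeInt just before the data...
  HeaderLen := PNativeInt(Data - SizeOf(NativeInt))^;
  // ...and the 32-bit reference count just before the length.
  HeaderRefCount := PInteger(Data - SizeOf(NativeInt) - SizeOf(Integer))^;

  Writeln(Format('Length() = %d, header length = %d', [Length(Bytes), HeaderLen]));
  Writeln(Format('header refcount = %d', [HeaderRefCount]));
end.
```

If the layout assumption holds, the header length should match what Length reports. Nothing in this sketch writes to the header, which is exactly what the code in the question ends up doing by accident.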
By performing FData := @FBuf[1] you are messing with the prefix fields of the dynamic array.
The 4 bytes in front of @FBuf[1] are interpreted as the length.
For Thing1 these are:
-8 (refcnt) -4 (len) 0 (data)
FBuf: 01 00 00 00 08 00 00 00 00 'S' 'n' .....
FData: 00 00 00 08 00 00 00 00 .............. //Hey that's a zero length.
Oops, when DynArrayAsg starts investigating, it sees that what it thinks is the source for the assignment has a length of zero, i.e. it thinks the source is empty and does not assign anything. It leaves FData unchanged!
Does Thing2 work as intended?
It looks like it does, but it actually fails in rather a bad way. Let me show you.
You've successfully tricked the runtime into believing @FBuf[1] is a valid reference to a dynamic array.
Because of this the FData pointer has been updated to point to FBuf[1] (so far so good), and the reference count of FData has been increased by 1 (not good). The runtime has also grown the memory block holding the dynamic array to what it thinks is the correct size for FData (bad).
-8 (refcnt) -4 (len) 0 (data)
FBuf: 01 01 00 00 13 00 00 00 01 'S' 'n' .....
FData: 01 00 00 13 00 00 00 01 'S' ..............
Oops, FData now has a refcount of 318,767,105 and a length of 16,777,216 bytes. FBuf also has its length increased, but its refcount is now 257.
This is why you need the call to SetLength to undo the massive overallocation of memory. This still does not fix the reference counts, though.
The overallocation may cause out of memory errors (especially on 64-bit), and the wacky refcounts cause a memory leak because your arrays will never get freed.
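The numbers above are just the same header bytes read one byte later. The following sketch reproduces the arithmetic on a plain byte array, so nothing is corrupted; it assumes the 32-bit layout from the dumps and a little-endian CPU that tolerates unaligned reads (x86). The names Thing1Raw and Thing2Raw are made up for the illustration.

```pascal
program HeaderOverlap;

{$APPTYPE CONSOLE}

var
  // Header plus first bytes of FBuf for Thing1: refcount 1, length 8, type byte 0.
  Thing1Raw: array[0..10] of Byte =
    ($01, $00, $00, $00,  $08, $00, $00, $00,  $00, Ord('S'), Ord('n'));
  // Same for Thing2: refcount 1, length $13, type byte 1.
  Thing2Raw: array[0..10] of Byte =
    ($01, $00, $00, $00,  $13, $00, $00, $00,  $01, Ord('S'), Ord('t'));
begin
  // FBuf[0] is at index 8, so @FBuf[1] is index 9. Seen from there, the
  // "length" is at indexes 5..8 and the "refcount" at indexes 1..4.
  Writeln(PInteger(@Thing1Raw[5])^);   // 0: the source looks empty
  Writeln(PInteger(@Thing2Raw[5])^);   // 16777216
  Writeln(PInteger(@Thing2Raw[1])^);   // 318767104

  // Incrementing the bogus refcount is what the assignment does...
  Inc(PInteger(@Thing2Raw[1])^);
  Writeln(PInteger(@Thing2Raw[1])^);   // 318767105
  // ...and it spills into the second byte of FBuf's real refcount.
  Writeln(PInteger(@Thing2Raw[0])^);   // 257
end.
```

The type byte of each Thing ends up as the most significant byte of the bogus length, which is why a type of 0 gives an "empty" source and a type of 1 gives 16 MB.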
The solution
As per David's answer: enable type-checked pointers with {$TYPEDADDRESS ON}.
You can fix the code by defining FData as a normal PAnsiChar or PByte.
If you make sure to always terminate your assignments to FBuf with a double zero, FData will work as expected.
Make FData a TBuffer like so:
TBuffer = record
private
  FData : PByte;
  function GetLength: cardinal;
  function GetType: byte;
public
  class operator implicit(const A: TBytes): TBuffer;
  class operator implicit(const A: TBuffer): PByte;
  property Length: cardinal read GetLength;
  property DataType: byte read GetType;
end;
Rewrite CreateThing like so:
constructor TThing.CreateThing(const AThingType : Byte; const AThingData: TBytes);
begin
  SetLength(FBuf, Length(AThingData) + Sizeof(AThingType) + 2);
  FBuf[0] := AThingType;
  Move(AThingData[0], FBuf[1], Length(AThingData));
  FBuf[Length(FBuf)-1]:= 0;
  FBuf[Length(FBuf)-2]:= 0; //trailing zeros for compatibility with PAnsiChar
  FData := FBuf; //will call the implicit class operator.
end;

class operator TBuffer.implicit(const A: TBytes): TBuffer;
begin
  Result.FData:= PByte(@A[1]);
end;
I don't understand all this mucking about trying to outsmart the compiler.
Why not just declare FData like so:
type
  TMyData = record
    DataType: byte;
    Buffer: AnsiString;
    ....
And work with that.
The problem can be seen readily by enabling type-checked pointers. Add this to the top of your code:
{$TYPEDADDRESS ON}
The documentation says:
The $T directive controls the types of pointer values generated by the @ operator and the compatibility of pointer types.
In the {$T-} state, the result of the @ operator is always an untyped pointer (Pointer) that is compatible with all other pointer types. When @ is applied to a variable reference in the {$T+} state, the result is a typed pointer that is compatible only with Pointer and with other pointers to the type of the variable.
In the {$T-} state, distinct pointer types other than Pointer are incompatible (even if they are pointers to the same type). In the {$T+} state, pointers to the same type are compatible.
With that change your program fails to compile. This line fails:
FData := @FBuf[1];
The error message is:
E2010 Incompatible types: 'System.TArray<System.Byte>' and 'Pointer'
Now, FData is of type TArray<Byte>, but @FBuf[1] is not a dynamic array but rather a pointer to a byte in the middle of a dynamic array. The two are not compatible. By operating in the default mode where pointers are not type-checked, the compiler lets you commit this terrible mistake. Quite why this is the default mode is utterly beyond me.
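As a small illustration of what the directive permits and what it rejects, here is a sketch with made-up names (Buf, P). Taking the address of an element is still fine as long as the target is a pointer to Byte; only the assignment to a dynamic array variable is refused.

```pascal
program TypedAddressDemo;

{$APPTYPE CONSOLE}
{$TYPEDADDRESS ON}

uses
  System.SysUtils;

var
  Buf: TBytes;
  P: PByte;
begin
  Buf := TBytes.Create(10, 20, 30);

  // Accepted: with type-checked pointers, @Buf[1] is a pointer to Byte.
  P := @Buf[1];
  Writeln(P^);   // 20

  // Rejected with E2010 if uncommented, because a pointer to Byte
  // is not a dynamic array:
  // Buf := @Buf[1];
end.
```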
A dynamic array is more than a pointer to the first element – there is also metadata such as length and reference count. That metadata is stored at an offset from the first element. Hence your entire design is flawed. Store the type code in a separate variable, and not as part of the dynamic array.
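A minimal sketch of that separate-variable design follows. It keeps the class and property names from the question; the field name FThingType, the use of Copy to take a private copy of the data, and the program name are my own choices rather than anything the question requires.

```pascal
program BytesFiddleFixed;

{$APPTYPE CONSOLE}
{$TYPEDADDRESS ON}

uses
  System.SysUtils;

type
  TThing = class
  private
    FThingType: Byte;   // type code kept outside the array
    FData: TBytes;      // a real dynamic array with its own header
  public
    constructor CreateThing(const AThingType: Byte; const AThingData: TBytes);
    property ThingType: Byte read FThingType;
    property ThingData: TBytes read FData;
  end;

constructor TThing.CreateThing(const AThingType: Byte; const AThingData: TBytes);
begin
  inherited Create;
  FThingType := AThingType;
  // Copy gives the object its own array, independent of the caller's.
  FData := Copy(AThingData);
end;

var
  Thing1, Thing2: TThing;
begin
  Thing1 := TThing.CreateThing(0, TEncoding.UTF8.GetBytes('Sneetch'));
  Thing2 := TThing.CreateThing(1, TEncoding.UTF8.GetBytes('Star Belly Sneetch'));
  try
    Writeln(TEncoding.UTF8.GetString(Thing2.ThingData));   // Star Belly Sneetch
    Writeln(Format('Type %d', [Thing2.ThingType]));        // Type 1
    Writeln(TEncoding.UTF8.GetString(Thing1.ThingData));   // Sneetch
    Writeln(Format('Type %d', [Thing1.ThingType]));        // Type 0
  finally
    Thing2.Free;
    Thing1.Free;
  end;
end.
```

If the type byte and the data really must live in one contiguous buffer (for example to send them over the wire), that buffer can be built on demand from the two fields instead of being aliased.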