
Testing string length generates more code than comparing to empty string?

Tags:

delphi

In Delphi, string <> '' seems to generate less code than Length(string) > 0.

Comparing for empty string, defined in TMyClass.UpdateString(const strMyString : String):

MyClassU.pas.31: begin
005CE6A0 55               push ebp                           
005CE6A1 8BEC             mov ebp,esp
005CE6A3 83C4F8           add esp,-$08
005CE6A6 8955F8           mov [ebp-$08],edx
005CE6A9 8945FC           mov [ebp-$04],eax
MyClassU.pas.32: if (strMyString <> '') then
005CE6AC 837DF800         cmp dword ptr [ebp-$08],$00
005CE6B0 740E             jz $005ce6c0

As I understand it, this is comparing the address of the dynamically allocated string ([ebp-$08]) to zero. Makes sense, since empty strings point to nil.
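The nil representation of an empty string can be checked from Delphi code directly. What follows is only a minimal sketch: the program name `EmptyIsNil` and the variable `S` are made up for illustration, and it assumes an ordinary console project.

```delphi
program EmptyIsNil;

{$APPTYPE CONSOLE}

var
  S: string;  // illustrative variable, not taken from the question
begin
  S := '';
  // An empty long string holds no heap block; the reference itself is nil.
  Writeln(Pointer(S) = nil);

  S := 'abc';
  // A non-empty string points at its character data, so the reference is not nil.
  Writeln(Pointer(S) = nil);
end.
```

This is why the compiler can reduce `strMyString <> ''` to a single pointer test.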

Comparing for length, defined in TMyClass.UpdateString2(const strMyString : String):

MyClassU.pas.25: begin
005CE664 55               push ebp
005CE665 8BEC             mov ebp,esp
005CE667 83C4F4           add esp,-$0c
005CE66A 8955F8           mov [ebp-$08],edx
005CE66D 8945FC           mov [ebp-$04],eax
005CE670 8B45F8           mov eax,[ebp-$08]
MyClassU.pas.26: if (Length(strMyString) > 0) then
005CE673 8945F4           mov [ebp-$0c],eax
005CE676 837DF400         cmp dword ptr [ebp-$0c],$00
005CE67A 740B             jz $005ce687
005CE67C 8B45F4           mov eax,[ebp-$0c]
005CE67F 83E804           sub eax,$04
005CE682 8B00             mov eax,[eax]
005CE684 8945F4           mov [ebp-$0c],eax
005CE687 837DF400         cmp dword ptr [ebp-$0c],$00
005CE68B 7E0E             jle $005ce69b

What? Shouldn't it just be cmp dword ptr [ebp-$04],$00, as the string length is stored at offset -$04 within the string?

My guess is it's because optimizations were off and the compiler did not optimize Length (which boils down to PInteger(PByte(S) - 4)^), but I don't understand why there are two comparisons. In fact both comparisons are present even with optimizations turned on:

MyClassU.pas.27: if (Length(strMyString) > 0) then
005CE6B1 8BC6             mov eax,esi
005CE6B3 85C0             test eax,eax
005CE6B5 7405             jz $005ce6bc
005CE6B7 83E804           sub eax,$04
005CE6BA 8B00             mov eax,[eax]
005CE6BC 85C0             test eax,eax
005CE6BE 7E0A             jle $005ce6ca

vs

MyClassU.pas.33: if (strMyString <> '') then
005CE6D9 85F6             test esi,esi
005CE6DB 740A             jz $005ce6e7
afarah asked Dec 17 '22 20:12


1 Answer

The second block of code does more work, and not surprisingly that takes more code.

In the first block of code you simply compare against the empty string. The compiler knows that is equivalent to comparing the pointer against nil and generates that code.

The second block of code first obtains the length of the string. That involves checking whether the pointer is nil. If it is, then the length is zero. Otherwise the length is read from the string metadata record, which sits just before the character data.

The compiler simply does not know that whenever the pointer is not nil, the length must be positive, so it cannot collapse the two tests into one.

As for why Length doesn't read from the string record directly, that should be obvious now. An empty string is implemented as the nil pointer and so has no string record. In order to find the length you need to deal with two different cases:

  1. String is empty, length is 0.
  2. String is not empty, length is read from the string record.
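The two cases can be written out in Delphi to mirror the disassembly from the question. This is a rough sketch, not the RTL source: `SketchLength` is a hypothetical helper, the length read reuses the `PInteger(PByte(S) - 4)^` expression quoted in the question, and it assumes a Delphi version in which `PByte` supports pointer arithmetic (Delphi 2009 or later).

```delphi
program StrLenSketch;

{$APPTYPE CONSOLE}

// Hypothetical helper for illustration only; it is not the RTL's Length.
function SketchLength(const S: string): Integer;
begin
  if Pointer(S) = nil then
    // Case 1: empty string. There is no string record to read from.
    Result := 0
  else
    // Case 2: non-empty string. The length field sits immediately
    // before the character data.
    Result := PInteger(PByte(S) - SizeOf(Integer))^;
end;

var
  Sample: string;  // illustrative variable
begin
  Sample := '';
  Writeln(SketchLength(Sample) = Length(Sample));

  Sample := 'abc';
  Writeln(SketchLength(Sample) = Length(Sample));
end.
```

The nil test corresponds to the first `test eax,eax` / `jz` pair in the optimized listing, and the `> 0` comparison on the result corresponds to the second `test eax,eax` / `jle` pair.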
David Heffernan answered Dec 24 '22 02:12