Why does eax give zero if it contains Self?

Tags: delphi, basm

According to "Using Assembler in Delphi", eax will contain Self. However, the content of eax is 0, as the code below shows. What is wrong?

procedure TForm1.FormCreate(Sender: TObject);
var
  X, Y: Pointer;
begin
  asm
    mov X, eax
    mov Y, edx
  end;
  ShowMessage(IntToStr(NativeInt(X)) + ' ; ' + IntToStr(NativeInt(Y)));
end;
SOUser asked Apr 17 '14 08:04


1 Answer

The code generated when I compile this, under debug settings, is like so:

  begin
005A9414 55               push ebp
005A9415 8BEC             mov ebp,esp
005A9417 83C4E4           add esp,-$1c
005A941A 33C9             xor ecx,ecx
005A941C 894DEC           mov [ebp-$14],ecx
005A941F 894DE8           mov [ebp-$18],ecx
005A9422 894DE4           mov [ebp-$1c],ecx
005A9425 8955F0           mov [ebp-$10],edx
005A9428 8945F4           mov [ebp-$0c],eax
005A942B 33C0             xor eax,eax
005A942D 55               push ebp
005A942E 6890945A00       push $005a9490
005A9433 64FF30           push dword ptr fs:[eax]
005A9436 648920           mov fs:[eax],esp
  mov X, eax
005A9439 8945FC           mov [ebp-$04],eax
  mov Y, edx
005A943C 8955F8           mov [ebp-$08],edx

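When the code starts executing, eax is indeed the Self pointer. But the compiler has chosen to save it away to ebp-$0c and then zeroise eax, which it uses to address fs:[0] while installing the exception frame that protects the temporary strings. That's really up to the compiler.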

The code under release settings is quite similar. The compiler still chooses to zeroise eax. Of course, you cannot rely on the compiler doing that.

  begin
005A82A4 55               push ebp
005A82A5 8BEC             mov ebp,esp
005A82A7 33C9             xor ecx,ecx
005A82A9 51               push ecx
005A82AA 51               push ecx
005A82AB 51               push ecx
005A82AC 51               push ecx
005A82AD 51               push ecx
005A82AE 33C0             xor eax,eax
005A82B0 55               push ebp
005A82B1 6813835A00       push $005a8313
005A82B6 64FF30           push dword ptr fs:[eax]
005A82B9 648920           mov fs:[eax],esp
  mov X, eax
005A82BC 8945FC           mov [ebp-$04],eax
  mov Y, edx
005A82BF 8955F8           mov [ebp-$08],edx

Remember that parameter passing defines the state of registers and stack when the function starts executing. What happens next, including how the function decodes the parameters, is down to the compiler. It is under no obligation to leave untouched the registers and stack that were used for parameter passing.

If you inject asm into the middle of a function, you cannot expect the volatile registers like eax to have particular values. They will hold whatever the compiler happened to put in them most recently.

If you want to examine the registers at the very beginning of the execution of the function, you need to use a pure asm function to be sure to avoid having the compiler modify the registers that were used for parameter passing:

procedure TForm1.FormCreate(Sender: TObject);
var
  X, Y: Pointer;
asm
  mov X, eax
  mov Y, edx
  // .... do something with X and Y
end;
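
As a self-contained illustration of the same idea, here is a minimal console sketch, assuming the 32-bit Delphi compiler and the default register calling convention. The names TProbe, Capture, GSelf and GTag are hypothetical and exist only for this example. Because the method body is pure asm, the first instructions see eax and edx exactly as the caller set them.

```pascal
program EntryRegs;
{$APPTYPE CONSOLE}

type
  // Hypothetical class, used only to obtain a method with a Self pointer.
  TProbe = class
    procedure Capture(Tag: Pointer);
  end;

var
  GSelf, GTag: Pointer; // globals that receive the entry values of eax and edx

procedure TProbe.Capture(Tag: Pointer);
asm
  // Pure asm body: no compiler-generated code runs before these instructions.
  mov GSelf, eax   // eax = Self on entry
  mov GTag, edx    // edx = first parameter (Tag) on entry
end;

var
  P: TProbe;
  Dummy: Integer;
begin
  P := TProbe.Create;
  try
    P.Capture(@Dummy);
    // Both comparisons should hold if the registers were captured at entry.
    Writeln(GSelf = Pointer(P));
    Writeln(GTag = Pointer(@Dummy));
  finally
    P.Free;
  end;
end.
```

This sketch applies only to the x86 compiler; on other targets the registers and calling convention differ.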

The compiler's choices depend heavily on the code in the rest of the function. For your code, the complexity of assembling the string to pass to ShowMessage causes quite a large preamble. Consider this code instead:

type
  TForm1 = class(TForm)
    procedure FormCreate(Sender: TObject);
  private
    i: Integer;
    function Sum(j: Integer): Integer;
  end;
....
procedure TForm1.FormCreate(Sender: TObject);
begin
  i := 624;
  Caption := IntToStr(Sum(42));
end;

function TForm1.Sum(j: Integer): Integer;
var
  X: Pointer;
begin
  asm
    mov X, eax
  end;
  Result := TForm1(X).i + j;
end;

In this case the code is simple enough for the compiler to leave eax alone. The optimised release build code for Sum is:

  begin
005A8298 55               push ebp
005A8299 8BEC             mov ebp,esp
005A829B 51               push ecx
  mov X, eax
005A829C 8945FC           mov [ebp-$04],eax
  Result := TForm1(X).i + j;
005A829F 8B45FC           mov eax,[ebp-$04]
005A82A2 8B80A0030000     mov eax,[eax+$000003a0]
005A82A8 03C2             add eax,edx
  end;
005A82AA 59               pop ecx
005A82AB 5D               pop ebp
005A82AC C3               ret 

And when you run the code, the form's caption is changed to the expected value.


To be perfectly honest, inline assembly, placed as an asm block inside a Pascal function, is not very useful. The thing about writing assembly is that you need to fully understand the state of the registers and the stack. That state is well defined at the beginning and end of a function, where the ABI specifies it.

But in the middle of a function, that state depends entirely on the decisions made by the compiler. Injecting asm blocks in there requires you to know the decisions the compiler made. It also means that the compiler cannot understand the decisions that you made. This is usually impractical. Indeed, for the x64 compiler, Embarcadero banned such inline asm blocks. I personally have never used an inline asm block in my code. If I ever write asm, I always write pure asm functions.
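
For comparison, the Sum method from earlier could be written as a pure asm function along the following lines. This is a sketch under assumptions, not a definitive version: it assumes the 32-bit compiler and the register calling convention, and the class TCounter and field FValue are hypothetical stand-ins for the form and its field i.

```pascal
program PureAsmSum;
{$APPTYPE CONSOLE}

type
  // Hypothetical class standing in for the form in the example above.
  TCounter = class
  private
    FValue: Integer;
  public
    function Sum(j: Integer): Integer;
  end;

function TCounter.Sum(j: Integer): Integer;
asm
  // On entry: eax = Self, edx = j. The Integer result is returned in eax.
  mov eax, [eax].TCounter.FValue  // load Self.FValue
  add eax, edx                    // add the parameter j
end;

var
  C: TCounter;
begin
  C := TCounter.Create;
  try
    C.FValue := 624;
    Writeln(C.Sum(42)); // 624 + 42 = 666
  finally
    C.Free;
  end;
end.
```

Since the whole body is asm, the register state at every instruction is determined by the ABI and by the code you wrote, not by choices the compiler made on your behalf.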

David Heffernan answered Oct 31 '22 23:10