 

Delphi optimisation : constant loop

I just noticed something quite interesting in a program I'm writing. I have a simple procedure that populates a TStringList with objects of type x.

I added a breakpoint while tracing an issue and stumbled across this, and was hoping someone might be able to explain why it is happening, or link to a relevant document, as I couldn't find anything.

My loop goes from 0 to 11. The loop variable was initialised by the for statement itself (for nPtr := 0), but when the program was run the nPtr variable went from 12 down to 1. I then also initialised the variable before the loop, as shown in the code snippet, but the same thing happened. The variable is used nowhere else in the unit.

I asked one of the guys I worked with who said it was due to Delphi optimisation but I'd like to know why and how it decides which loop should be affected.

Thanks for any help.

Code:

procedure TUnit.ProcedureName;
var
  nPtr: Integer;
  obj: TObject;
begin
  nPtr := 0; //added later
  for nPtr := 0 to 11 do
  begin
    obj := TObject.Create(Self);
    slPeriodList.AddObject('X', obj);
  end;
end;
asked Feb 02 '15 14:02 by null



1 Answer

The optimization is only possible if the loop body does not refer to the loop variable. In that case, if the lower bound of the loop is zero, then the compiler will reverse the loop.

If the loop variable is never referenced by the loop body then the compiler is justified in implementing the loop however it pleases. All it is required to do is execute the loop body as many times as is mandated by the loop bounds. Indeed, the compiler would be perfectly justified in optimizing away the loop variable.
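For contrast, here is a minimal sketch of a loop whose body does read the loop variable, so the body has to see the values 0 through 11 in ascending order and the variable cannot be treated as a mere countdown. The function name BuildPeriodNames and the 'Period ' prefix are hypothetical and not taken from the question.

```pascal
program UseLoopVar;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper, not from the original post.
// The body reads nPtr, so each iteration must observe 0, 1, ..., 11 in order.
function BuildPeriodNames: TStringList;
var
  nPtr: Integer;
begin
  Result := TStringList.Create;
  for nPtr := 0 to 11 do
    Result.Add('Period ' + IntToStr(nPtr));
end;

var
  Names: TStringList;
begin
  Names := BuildPeriodNames;
  try
    Writeln(Names.Count); // prints 12
    Writeln(Names[0]);    // prints Period 0
    Writeln(Names[11]);   // prints Period 11
  finally
    Names.Free;
  end;
end.
```

When a loop like the one in the question is inspected in the debugger, the value shown for an unreferenced loop variable is just whatever counter the compiler chose to keep, so it should not be relied on.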

Consider this program:

{$APPTYPE CONSOLE}

procedure Test1;
var
  i: Integer;
begin
  for i := 0 to 11 do
    Writeln(0);
end;

procedure Test2;
var
  i: Integer;
begin
  for i := 0 to 11 do
    Writeln(i);
end;

begin
  Test1;
  Test2;
end.

The body of Test1 is compiled to this code by XE7, 32 bit Windows compiler, with release options:

Project1.dpr.9: for i := 0 to 11 do
00405249 BB0C000000       mov ebx,$0000000c
Project1.dpr.10: Writeln(0);
0040524E A114784000       mov eax,[$00407814]
00405253 33D2             xor edx,edx
00405255 E8FAE4FFFF       call @Write0Long
0040525A E8D5E7FFFF       call @WriteLn
0040525F E800DBFFFF       call @_IOTest
Project1.dpr.9: for i := 0 to 11 do
00405264 4B               dec ebx
00405265 75E7             jnz $0040524e

The compiler is running the loop downwards, as can be seen by the use of dec. Notice that the test for loop termination is performed with jnz with no need for a cmp. That is because dec performs an implicit compare against zero.

The x86 instruction reference for dec says the following:

Flags Affected

The CF flag is not affected. The OF, SF, ZF, AF, and PF flags are set according to the result.

The ZF flag is set if and only if the result of the dec instruction is zero. And the ZF is what determines whether or not jnz branches.
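A rough source-level rendition of that dec/jnz pattern may make it easier to see. This is only a sketch of the shape of the emitted code, not what the compiler literally does internally; the function name RunBodyCountingDown is hypothetical, and it assumes at least one iteration, as is the case for the constant bounds 0 to 11.

```pascal
program CountDownLoop;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical helper, not from the original post.
// Runs a stand-in loop body Iterations times using a counter that
// counts down to zero. Assumes Iterations >= 1.
function RunBodyCountingDown(Iterations: Integer): Integer;
var
  Remaining: Integer;
begin
  Result := 0;
  Remaining := Iterations; // corresponds to: mov ebx,$0000000c
  repeat
    Inc(Result);           // stands in for the loop body
    Dec(Remaining);        // corresponds to: dec ebx
  until Remaining = 0;     // corresponds to: jnz (loop while not zero)
end;

begin
  Writeln(RunBodyCountingDown(12)); // prints 12
end.
```

The counter takes the values 12 down to 1 while the body runs, which matches what the question observed for nPtr in the debugger.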

The code emitted for Test2 is:

Project1.dpr.17: for i := 0 to 11 do
0040526D 33DB             xor ebx,ebx
Project1.dpr.18: Writeln(i);
0040526F A114784000       mov eax,[$00407814]
00405274 8BD3             mov edx,ebx
00405276 E8D9E4FFFF       call @Write0Long
0040527B E8B4E7FFFF       call @WriteLn
00405280 E8DFDAFFFF       call @_IOTest
00405285 43               inc ebx
Project1.dpr.17: for i := 0 to 11 do
00405286 83FB0C           cmp ebx,$0c
00405289 75E4             jnz $0040526f

Note that the loop variable is increasing, and we now have an extra cmp instruction, executed on every loop iteration.

It is perhaps interesting to note that the 64 bit Windows compiler does not include this optimization. For Test1 it produces this:

Project1.dpr.9: for i := 0 to 11 do
00000000004083A5 4833DB           xor rbx,rbx
Project1.dpr.10: Writeln(0);
00000000004083A8 488B0D01220000   mov rcx,[rel $00002201]
00000000004083AF 4833D2           xor rdx,rdx
00000000004083B2 E839C3FFFF       call @Write0Long
00000000004083B7 4889C1           mov rcx,rax
00000000004083BA E851C7FFFF       call @WriteLn
00000000004083BF E86CB4FFFF       call @_IOTest
00000000004083C4 83C301           add ebx,$01
Project1.dpr.9: for i := 0 to 11 do
00000000004083C7 83FB0C           cmp ebx,$0c
00000000004083CA 75DC             jnz Test1 + $8

I'm not sure why this optimization has not been implemented in the 64 bit compiler. My guess would be that the optimization has negligible effect in real world cases and the designers chose not to expend effort implementing it for the 64 bit compiler.

answered Oct 21 '22 14:10 by David Heffernan
