I want to use a <code>DivMod</code> function that operates exclusively on 32 bit operands. The implementation in the RTL returns values in 16 bit variables. Its declaration is: <pre class="prettyprint"><code>procedure DivMod(Dividend: Cardinal; Divisor: Word; var Result, Remainder: Word); </code></pre> So, I cannot use that since my inputs may overflow the return values. The naive Pascal implementation looks like this: <pre class="prettyprint"><code>procedure DivMod(Dividend, Divisor: Cardinal; out Quotient, Remainder: Cardinal); begin Quotient := Dividend div Divisor; Remainder := Dividend mod Divisor; end; </code></pre> This works splendidly but performs the division twice. Since the function is called by part of my code that is in a performance bottleneck, I would like to perform the division once only. To that end I am using Serg's 32 bit DivMod from this question: Is there a DivMod that is *not* Limited to Words (<=65535)? <pre class="prettyprint"><code>procedure DivMod(Dividend, Divisor: Cardinal; out Quotient, Remainder: Cardinal); asm PUSH EBX MOV EBX,EDX XOR EDX,EDX DIV EBX MOV [ECX],EAX MOV EBX,Remainder MOV [EBX],EDX POP EBX end; </code></pre> This works perfectly. But now I would like a version of the function for 64 bit code. Note that I still want to operate on 32 bit operands, and return 32 bit values. Should I re-write the function using 64 bit assembler, or is it sufficient to use the <code>DivMod</code> overload from the RTL that operates on, and returns, 64 bit values? Specifically I would like to know if there is a performance benefit in writing 64 bit code that does 32 bit operations. Is that even possible? Or would I simply end up re-implementing the <code>DivMod</code> overload with <code>UInt64</code> parameters? If it is worth implementing a bespoke 64 bit asm version, how would I go about doing it, noting that the operands and operations are 32 bit. I think that it would look like this, but I am no expert and likely have got something wrong: <pre class="prettyprint"><code>procedure DivMod(Dividend, Divisor: Cardinal; out Quotient, Remainder: Cardinal); asm MOV EAX,ECX // move Dividend to EAX MOV ECX,EDX // move Divisor to ECX XOR EDX,EDX // zeroise EDX DIV ECX // divide EDX:EAX by ECX MOV [R8],EAX // save quotient MOV [R9],EDX // save remainder end; </code></pre>

For the special case of always dividing by 10 (per comments) you can do something like this : <pre class="prettyprint"><code>procedure DivMod10(num : Cardinal; var q, r : Cardinal); inline; var rl : uInt64; begin rl := UInt64(3435973837)*num; q := rl shr 35; r := num - q*10; end; </code></pre> The algorithm varies depending on the denominator, but the source for determining it and the magic numbers can be found in libdivide. This is tested accurate for all unsigned 32-bit integers and is about 3 times faster than using <code>div</code> (and provides the remainder). Benchmark (optimizations on): <pre class="prettyprint"><code> t0 := GetTickCount; for I := 1 to 999999999 do begin DivMod10(i, q, r); end; ShowMessage(IntToStr(GetTickCount - t0)); // result : 1809 t0 := GetTickCount; for I := 1 to 999999999 do begin q := i div 10; end; ShowMessage(IntToStr(GetTickCount - t0)); // result : 5336 </code></pre> Test : <pre class="prettyprint"><code>for I := 1 to High(Cardinal) do begin DivMod10(i,q,r); if q <> (i div 10) then WriteLn(IntToStr(i)); // no mismatch found end; </code></pre>

I dug a bit deeper. I think it would be perfectly reasonably to implement this on top of the <code>UInt64</code> version. That would look like this: <pre class="prettyprint"><code>procedure DivMod(Dividend, Divisor: Cardinal; out Quotient, Remainder: Cardinal); var Quotient64, Remainder64: UInt64; begin DivMod(Dividend, Divisor, Quotient64, Remainder64); Quotient := Quotient64; Remainder := Remainder64; end; </code></pre> I don't think the performance would be very significantly affected in comparison to the most optimal asm version. However, I believe that the x64 asm code in the question is correct. The <code>MOV</code> instructions are all fine with 32 bit operands. And the <code>DIV</code> is also as described in the comment in the asm code. The Intel documentation for <code>DIV r/m32</code> says: <blockquote> Unsigned divide EDX:EAX by r/m32, with result stored in EAX ← Quotient, EDX ← Remainder. </blockquote> And let's take a look at what the Delphi compiler does with this code: <pre class="prettyprint"><code>var a, b, c, d: Cardinal; .... a := 666; b := 42; c := a div b; d := a mod b; </code></pre> The code that is produced is: <pre class="prettyprint"> Project39.dpr.14: a := 666; 0000000000423A68 C7450C9A020000 mov [rbp+$0c],$0000029a Project39.dpr.15: b := 42; 0000000000423A6F C745082A000000 mov [rbp+$08],$0000002a Project39.dpr.16: c := a div b; 0000000000423A76 8B450C mov eax,[rbp+$0c] 0000000000423A79 33D2 xor edx,edx 0000000000423A7B F77508 div dword ptr [rbp+$08] 0000000000423A7E 894504 mov [rbp+$04],eax Project39.dpr.17: d := a mod b; 0000000000423A81 8B450C mov eax,[rbp+$0c] 0000000000423A84 33D2 xor edx,edx 0000000000423A86 F77508 div dword ptr [rbp+$08] 0000000000423A89 895500 mov [rbp+$00],edx </pre> I don't have any expectation that the 32 bit divide will be more efficient than a 64 bit divide, but that doesn't really matter. It seems more natural to perform the 32 bit operation with 32 bit operands.

How do I implement an efficient 32 bit DivMod in 64 bit code

Q: How to install a 32-bit program on a 64-bit PC?

The process of installing a 32-bit program on a 64-bit PC is the same as installing it on a 32-bit computer. Secondly, double-click on the shortcut of the installed 32-bit software to open it. Or, you can right-click on it and select Open.

Q: What is the difference between 32-bit and 64-bit programs?

Usually, a 32-bit program is designed for 32-bit Windows and also for 64-bit systems. Yet, a 64-bit program is only developed to run on 64-bit Windows. A 32-bit can run on a 64-bit Windows without accessing all the features and memory that a 64-bit can offer.

Q: What is the difference between Divmod 8 and Divmod 5?

divmod (8, 3) = (2, 2) divmod (3, 8) = (0, 3) divmod (5, 5) = (1, 0) Here, we have used the divmod () method with integer arguments divmod (8,3) - computes quotient and remainder of 8/3 - results in (2, 2) divmod (3,8) - computes quotient and remainder of 3/8 - results in (0, 3)

Q: Can I convert a 32-bit DLL file to a 64-bit file?

There might be a workaround Accessing 32-bit DLLs from 64-bit code [ ^ ], but it comes with a lot of strings attached. You might have to re-evaluate your reasons for wanting to convert to 64-bit. Correct, my 5.

Tags:

assembly

x86-64

delphi

I want to use a DivMod function that operates exclusively on 32 bit operands. The implementation in the RTL returns values in 16 bit variables. Its declaration is:

procedure DivMod(Dividend: Cardinal; Divisor: Word; var Result, Remainder: Word);

So, I cannot use that since my inputs may overflow the return values.

The naive Pascal implementation looks like this:

procedure DivMod(Dividend, Divisor: Cardinal; out Quotient, Remainder: Cardinal);
begin
  Quotient := Dividend div Divisor;
  Remainder := Dividend mod Divisor;
end;

This works splendidly but performs the division twice. Since the function is called by part of my code that is in a performance bottleneck, I would like to perform the division once only. To that end I am using Serg's 32 bit DivMod from this question: Is there a DivMod that is *not* Limited to Words (<=65535)?

procedure DivMod(Dividend, Divisor: Cardinal; out Quotient, Remainder: Cardinal);
asm
        PUSH EBX
        MOV  EBX,EDX
        XOR  EDX,EDX
        DIV  EBX
        MOV  [ECX],EAX
        MOV  EBX,Remainder
        MOV  [EBX],EDX
        POP  EBX
end;

This works perfectly.

But now I would like a version of the function for 64 bit code. Note that I still want to operate on 32 bit operands, and return 32 bit values.

Should I re-write the function using 64 bit assembler, or is it sufficient to use the DivMod overload from the RTL that operates on, and returns, 64 bit values?

Specifically I would like to know if there is a performance benefit in writing 64 bit code that does 32 bit operations. Is that even possible? Or would I simply end up re-implementing the DivMod overload with UInt64 parameters? If it is worth implementing a bespoke 64 bit asm version, how would I go about doing it, noting that the operands and operations are 32 bit.

I think that it would look like this, but I am no expert and likely have got something wrong:

procedure DivMod(Dividend, Divisor: Cardinal; out Quotient, Remainder: Cardinal);
asm
        MOV   EAX,ECX   // move Dividend to EAX
        MOV   ECX,EDX   // move Divisor to ECX
        XOR   EDX,EDX   // zeroise EDX
        DIV   ECX       // divide EDX:EAX by ECX
        MOV   [R8],EAX  // save quotient
        MOV   [R9],EDX  // save remainder
end;

936

asked Nov 28 '13 16:11

David Heffernan

2 Answers

For the special case of always dividing by 10 (per comments) you can do something like this :

procedure DivMod10(num : Cardinal; var q, r : Cardinal); inline;
var
  rl : uInt64;
begin
  rl := UInt64(3435973837)*num;
  q := rl shr 35;
  r := num - q*10;
end;

The algorithm varies depending on the denominator, but the source for determining it and the magic numbers can be found in libdivide. This is tested accurate for all unsigned 32-bit integers and is about 3 times faster than using div (and provides the remainder).

Benchmark (optimizations on):

  t0 := GetTickCount;
  for I := 1 to 999999999 do begin
    DivMod10(i, q, r);
  end;
  ShowMessage(IntToStr(GetTickCount - t0));  // result :  1809

  t0 := GetTickCount;
  for I := 1 to 999999999 do begin
    q := i div 10;
  end;
  ShowMessage(IntToStr(GetTickCount - t0));  // result :  5336

Test :

for I := 1 to High(Cardinal) do begin
  DivMod10(i,q,r);
  if q <> (i div 10) then WriteLn(IntToStr(i));
  // no mismatch found
end;

136

answered Oct 21 '22 08:10

J...

I dug a bit deeper. I think it would be perfectly reasonably to implement this on top of the UInt64 version. That would look like this:

procedure DivMod(Dividend, Divisor: Cardinal; out Quotient, Remainder: Cardinal);
var
  Quotient64, Remainder64: UInt64;
begin
  DivMod(Dividend, Divisor, Quotient64, Remainder64);
  Quotient := Quotient64;
  Remainder := Remainder64;
end;

I don't think the performance would be very significantly affected in comparison to the most optimal asm version.

However, I believe that the x64 asm code in the question is correct. The MOV instructions are all fine with 32 bit operands. And the DIV is also as described in the comment in the asm code. The Intel documentation for DIV r/m32 says:

Unsigned divide EDX:EAX by r/m32, with result stored in EAX ← Quotient, EDX ← Remainder.

And let's take a look at what the Delphi compiler does with this code:

var
  a, b, c, d: Cardinal;
....
a := 666;
b := 42;
c := a div b;
d := a mod b;

The code that is produced is:

    
Project39.dpr.14: a := 666;
0000000000423A68 C7450C9A020000   mov [rbp+$0c],$0000029a
Project39.dpr.15: b := 42;
0000000000423A6F C745082A000000   mov [rbp+$08],$0000002a
Project39.dpr.16: c := a div b;
0000000000423A76 8B450C           mov eax,[rbp+$0c]
0000000000423A79 33D2             xor edx,edx
0000000000423A7B F77508           div dword ptr [rbp+$08]
0000000000423A7E 894504           mov [rbp+$04],eax
Project39.dpr.17: d := a mod b;
0000000000423A81 8B450C           mov eax,[rbp+$0c]
0000000000423A84 33D2             xor edx,edx
0000000000423A86 F77508           div dword ptr [rbp+$08]
0000000000423A89 895500           mov [rbp+$00],edx

I don't have any expectation that the 32 bit divide will be more efficient than a 64 bit divide, but that doesn't really matter. It seems more natural to perform the 32 bit operation with 32 bit operands.

answered Oct 21 '22 07:10

David Heffernan

Related questions
                            
                                Callback from .Net COM dll to Delphi client in registration-free (side-by-side) COM
                            
                                TComboBox - how to adjust drop down list height while it is dropped down?
                            
                                Delphi interface reference count mechanism
                            
                                What's the idiomatic way to do async socket programming in Delphi?
                            
                                How to show a window that acts like a popup menu?
                            
                                Getting DUnit Test Coverage stats using FinalBuilder and AQTime
                            
                                The control 'xxx' has no parent window
                            
                                How to debug a DLL called from a DLL in Delphi
                            
                                How do I send post and header data with Chromium?
                            
                                Why doesn't the optimizer eliminate High in a loop?
                            
                                Why does a call to GetDIBits fail on Win64?
                            
                                Delphi - how to check for any windows user if it is an administrator?
                            
                                Nested generic record
                            
                                When should I use VirtualTree's BeginSynch vs BeginUpdate?
                            
                                Setting EXE icon in Delphi 7
                            
                                If terminating a hung thread is a good idea, how do I do it safely?
                            
                                How can I read barcodes without having the user focus a text box first?
                            
                                Delphi: Aero-Glass Tabs [closed]
                            
                                How I can add a custom Color to an TColorBox using the OnGetColors event?
                            
                                Delphi : how to check if a file exists (path over 255 characters)

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

How do I implement an efficient 32 bit DivMod in 64 bit code

Tags:

assembly

x86-64

delphi

David Heffernan

People also ask

2 Answers

J...

David Heffernan

Recent Activity

Donate For Us