The following functions do not compile with the 64-bit Delphi XE2 compiler. (The errors all relate to the fld instructions.)
[dcc64 Error] Project1.dpr(12): E2116 Invalid combination of opcode and operands
[dcc64 Error] Project1.dpr(13): E2116 Invalid combination of opcode and operands
[dcc64 Error] Project1.dpr(20): E2116 Invalid combination of opcode and operands
Line 12 & 13:
fld Y
fld X
Line 20:
fld X
Unfortunately I have no assembly skills, and I am using this third-party code, which I need to port to 64-bit. Can you help me make it work on both 32-bit and 64-bit?
function PartArcTan(Y, X: Extended): Extended;
asm
  fld Y                 // st(0) = Y
  fld X                 // st(0) = X
  fpatan                // st(0) = ArcTan(Y, X)
  fwait
end;

function ArcSin(X: Extended): Extended; // -1 <= X <= 1
asm
  fld X                 // st(0) = X
  fld st(0)             // st(1) = X
  fmul st(0), st(0)     // st(0) = Sqr(X)
  fld1                  // st(0) = 1
  fsubrp st(1), st(0)   // st(0) = 1 - Sqr(X)
  fsqrt                 // st(0) = Sqrt(1 - Sqr(X))
  fpatan                // st(0) = ArcTan(X, Sqrt(1 - X*X))
  fwait
end;
The main problem with this code, for porting to x64, is that it uses the wrong floating point unit. On x64 floating point is done on the SSE unit.
Yes, the x87 unit is still there, but it is slow in comparison. Another problem is that the x64 ABI assumes that you will use the SSE unit. Parameters arrive in SSE registers, and floating point values are returned in an SSE register. It's pointless (not to mention hard work and time-consuming) to transfer values between the SSE and x87 units. What's more, the floating point control state, including the exception masks, is initialised for the SSE unit, but can you be sure that it will be correctly set for the x87 unit?
So, in view of all this, I strongly advise you to make sure that all your floating point code is executed on the SSE unit under x64. I think that the only time a case could be made for using the x87 unit is for an algorithm that requires the 10 byte extended type, which is supported on x87 but not on SSE. That is not the case here.
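If you want to see what the Extended type means on each target, a quick check like the one below may help. It is only a sketch: the program name is made up, and it relies on the documented behaviour of the Delphi Win64 compiler, where Extended is an alias for Double.

```pascal
program ExtendedSizeDemo; // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

begin
  // With the 32-bit Windows compiler Extended is the 10 byte x87 type.
  // With the 64-bit Windows compiler Extended is an alias for Double (8 bytes).
  Writeln('SizeOf(Extended) = ', SizeOf(Extended));
  Writeln('SizeOf(Double)   = ', SizeOf(Double));
end.
```

So the Extended parameters in the third-party functions are already plain doubles once you compile for Win64, which is why there is nothing to gain from the x87 unit for this code.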
Now, porting to the SSE unit is not as simple as translating the opcodes to SSE equivalents. That's because the SSE floating unit has much less capability built-in. For instance, there are no trigonometric functions included in the SSE opcodes.
So, the right way to deal with this is to switch to using Pascal code. These functions can be replaced by Math.ArcTan2 and Math.ArcSin respectively.
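As a minimal sketch of that replacement, assuming you want to keep the original function names so that the rest of the third-party code compiles unchanged, the wrappers could look something like this. The program name and the small test in the main block are mine, not part of the original code. Note that the call to the library ArcSin has to be unit-qualified, otherwise the wrapper would call itself.

```pascal
program ArcFunctionsDemo; // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

uses
  System.Math;

// Same name and parameter order (Y, X) as the assembler version.
function PartArcTan(Y, X: Extended): Extended;
begin
  Result := ArcTan2(Y, X);
end;

// Same name as the assembler version; valid for -1 <= X <= 1.
// The call is qualified with the unit name to avoid recursing into this wrapper.
function ArcSin(X: Extended): Extended;
begin
  Result := System.Math.ArcSin(X);
end;

begin
  Writeln(PartArcTan(1.0, 1.0):0:6); // approximately Pi/4
  Writeln(ArcSin(1.0):0:6);          // approximately Pi/2
end.
```

This compiles for both 32-bit and 64-bit targets, because it contains no assembler at all and leaves the choice of floating point unit to the compiler and the RTL.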
To elaborate on this, let's look at what is involved in doing the calculation on the x87 unit, under x64. The code for ArcSin goes like this:
function ArcSin(X: Double): Double;
// to be 100% clear, do not use this code
asm
  movq [rsp-8], xmm0      // X arrives in xmm0, move it to stack memory
  fld qword ptr [rsp-8]   // now load X into the x87 unit
  fld st(0)               // calculation code exactly as before
  fmul st(0), st(0)
  fld1
  fsubrp st(1), st(0)
  fsqrt
  fpatan
  fwait
  fstp qword ptr [rsp-8]  // but now we need to move the return value
  movq xmm0, [rsp-8]      // back into xmm0, again via the stack
end;
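Points to note: the argument arrives in xmm0 and the result has to be returned in xmm0, so both must be moved through stack memory, because values cannot be transferred directly between the SSE and x87 units.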
So, perhaps this can serve as a warning to future visitors who wish to use the x87 to perform floating point arithmetic under x64.
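If you would rather leave the original assembler untouched for the 32-bit build, one possible approach is conditional compilation: keep the x87 code under CPUX86 and fall back to Pascal everywhere else. This is only a sketch of that idea, shown for PartArcTan; the program name and the test call are my own additions, and ArcSin would follow the same pattern.

```pascal
program PortableArcTan; // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

uses
  System.Math;

function PartArcTan(Y, X: Extended): Extended;
{$IFDEF CPUX86}
// 32-bit build: the original x87 assembler, unchanged.
asm
  fld Y        // st(0) = Y
  fld X        // st(0) = X
  fpatan       // st(0) = ArcTan(Y, X)
  fwait
end;
{$ELSE}
// Any other target, including Win64: plain Pascal.
begin
  Result := ArcTan2(Y, X);
end;
{$ENDIF}

begin
  Writeln(PartArcTan(1.0, 1.0):0:6); // approximately Pi/4
end.
```

That said, using the Pascal version on both targets is simpler to maintain, and I would only keep the assembler if you have a measured reason to.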