The FISTP instruction changes 0.75 to 1 (because of rounding).
I want 0.75 to turn into 0, not 1.
Is there an alternative to FIST/FISTP that truncates instead of rounds?
You truly have a plethora of options here:
If you're using SSE2 instructions anyway, then you can use the SSE2 instructions for converting a floating-point value to an integer value with truncation. Peter Cordes's answer discusses this approach. CVTTSD2SI is the scalar version, and CVTTPD2DQ is the packed/vector version.
If you're targeting x86-64, SSE2 will always be available, and this is what you should be using for all floating-point operations. The x87 FPU is completely obsolete on x86-64.
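As a rough illustration of the SSE2 route from C, the sketch below uses the _mm_cvttsd_si32 intrinsic from emmintrin.h, which compilers map to CVTTSD2SI. The names truncate_sse2, truncate_sse2_demo, and EXAMPLE_HAVE_SSE2 are made up for this example, and the preprocessor guard is an assumption about how SSE2 support is detected; without SSE2 it falls back to a plain cast, which C also defines to truncate toward zero.

```c
#include <assert.h>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>          /* SSE2 intrinsics */
#define EXAMPLE_HAVE_SSE2 1     /* hypothetical macro, used only in this sketch */
#endif

/* Hypothetical helper: convert a double to int, truncating toward zero. */
static int truncate_sse2(double x)
{
#ifdef EXAMPLE_HAVE_SSE2
    /* _mm_set_sd places x in the low lane; _mm_cvttsd_si32 converts
       that lane with truncation (the CVTTSD2SI instruction). */
    return _mm_cvttsd_si32(_mm_set_sd(x));
#else
    return (int)x;              /* a C cast truncates toward zero as well */
#endif
}

/* Usage: the fractional part is discarded for both signs. */
static void truncate_sse2_demo(void)
{
    assert(truncate_sse2(0.75) == 0);
    assert(truncate_sse2(-0.75) == 0);
    assert(truncate_sse2(2.99) == 2);
    assert(truncate_sse2(-2.99) == -2);
}
```

In practice an ordinary cast is usually enough on x86-64; the intrinsic is shown only to make the instruction choice explicit.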
If you're targeting x86-32 processors prior to the Pentium 4 or Athlon 64, then SSE2 instructions will not be available. In that case, SSE instructions may still be available (SSE is supported by Pentium 3, Athlon XP, and later). SSE supports only single-precision floating-point operations, so if you don't need the precision, you can use CVTTSS2SI (scalar) or CVTTPS2PI (packed; it writes two results to an MMX register, whereas the XMM-destination form CVTTPS2DQ requires SSE2). Unfortunately, you often need the precision; see below for a better workaround.
If SSE3 instructions are available (Pentium 4 Prescott, certain Athlon 64s, and later), then you can use the FISTTP instruction, which is like FISTP, except that it always truncates, regardless of the current rounding mode. This is the solution that fuz's answer presents.
This is a very good solution if you are already using the x87 FPU, but is of limited applicability because if you're targeting chips that support SSE3, they necessarily support SSE2, and therefore you should be using SSE instructions to do all floating-point manipulation. The only exception is if you really need the extended 80-bit precision offered by the x87 FPU for your intermediate calculations (SSE2 is limited to 64-bit double-precision).
If you are stuck on legacy x86-32 processors and using the x87 FPU without SSE, you're still not out of options. There are a couple of fast bit-twiddling methods. These were not my original innovations: the code is scattered around the Internet in various places, and I just collated and tweaked it slightly, so I can neither take full credit nor cite a particular source. Here is one such source.
For single-precision floating-point values, the entire bit representation fits into a 32-bit register, so the implementation is straightforward (this assumes that the floating-point value to be truncated is at the top of the x87 FPU stack):
; Retrieve the bit representation of the original floating-point value.
push eax
fst DWORD PTR [esp]
mov eax, DWORD PTR [esp]
; Twiddle those raw bits.
and eax, 080000000H
xor eax, 0BEFFFFFFH
; Store those manipulated bits back in memory, since we can't load
; directly from a register to the x87 FPU stack.
mov DWORD PTR [esp], eax
; Add the modified value to the original value at the top of the stack.
fadd DWORD PTR [esp]
; Round the adjusted floating-point value to an integer.
; (With the FPU in its default round-to-nearest mode, our bit
; manipulation makes this store behave like a truncation.)
fistp DWORD PTR [esp]
; ... do something with the 32-bit result stored at [esp]
pop eax ; release the stack slot (this also loads the result into EAX)
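To see what the AND/XOR pair actually produces, here is a small C sketch of just that step. The function names adjustment_bits and adjustment_bits_demo are invented for illustration, and the code assumes a 32-bit IEEE 754 float.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper mirroring the AND/XOR pair: returns the bit pattern
   of the adjustment that would be added to x. */
static uint32_t adjustment_bits(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);   /* same role as the fst + mov pair */
    bits &= 0x80000000u;              /* keep only the sign bit */
    bits ^= 0xBEFFFFFFu;              /* merge it into the constant */
    return bits;
}

/* Usage: the adjustment always has the opposite sign of the input. */
static void adjustment_bits_demo(void)
{
    assert(adjustment_bits(0.75f)  == 0xBEFFFFFFu);  /* just under -0.5 in magnitude */
    assert(adjustment_bits(-0.75f) == 0x3EFFFFFFu);  /* just under +0.5 */
}
```

Pulling the value almost half a unit toward zero is what lets a round-to-nearest store land on the truncated integer.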
An alternative implementation uses a static array of "adjustment" values, which we index into based on the "signedness" of the original floating-point value. This is basically what a naïve "truncate" function written in C would do, except that this does it branchlessly:
const uint32_t kSingleAdjustments[2] = { 0xBEFFFFFF, /* -0.49999997f */
0x3EFFFFFF /* +0.49999997f */ };
; Retrieve the bit representation of the floating-point value.
push eax
fst DWORD PTR [esp]
mov eax, DWORD PTR [esp]
; Isolate the sign bit.
shr eax, 31
; Use the sign bit as an index into the array of values to add the appropriate
; adjustment value to the original floating-point value at the top of the stack.
; (NOTE: This syntax is for MSVC's inline asm; translate as necessary.)
fadd DWORD PTR [kSingleAdjustments + (eax * TYPE kSingleAdjustments)]
; Round the adjusted floating-point value to an integer.
; (With the default round-to-nearest mode in effect, our adjustment ensures that it will be truncated.)
fistp DWORD PTR [esp]
; ... do something with the 32-bit result stored at [esp]
pop eax
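The arithmetic behind the table-lookup variant can be modeled in portable C. This is only a sketch of the idea, not the assembly itself: round_nearest_even is a hand-written stand-in for FISTP in round-to-nearest-even mode (assumed to be the active mode), truncate_by_adjustment is a hypothetical name, and the addition is done in double to mimic the x87's wider-than-float intermediate precision.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static const uint32_t kSingleAdjustments[2] = { 0xBEFFFFFFu,   /* -0.49999997f */
                                                0x3EFFFFFFu }; /* +0.49999997f */

/* Stand-in for FISTP under round-to-nearest-even (assumes v fits in a long). */
static long round_nearest_even(double v)
{
    long n = (long)v;                 /* toward zero */
    if ((double)n > v) n--;           /* now n == floor(v) */
    double frac = v - (double)n;      /* in [0, 1) */
    if (frac > 0.5 || (frac == 0.5 && (n & 1))) n++;
    return n;
}

/* Hypothetical model of the sequence: pick the adjustment by sign, add, round. */
static long truncate_by_adjustment(float x)
{
    uint32_t bits;
    float adj;
    memcpy(&bits, &x, sizeof bits);
    memcpy(&adj, &kSingleAdjustments[bits >> 31], sizeof adj);
    return round_nearest_even((double)x + (double)adj);
}

/* Usage: fractional inputs drop their fraction, whole numbers are unchanged. */
static void truncate_by_adjustment_demo(void)
{
    assert(truncate_by_adjustment(0.75f) == 0);
    assert(truncate_by_adjustment(-0.75f) == 0);
    assert(truncate_by_adjustment(1.0f) == 1);
    assert(truncate_by_adjustment(-3.0f) == -3);
}
```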
My benchmarks suggest that the second variant is faster on Intel processors, but slower on AMD (specifically, Athlon XP and Athlon 64). I ultimately settled on approach #2 for my library, especially since I re-use the "adjustment" values to implement other types of fast rounding.
Note that the final FISTP instruction supports both m32 and m64 operands, so if you want to truncate to a 64-bit integer for greater range, that is possible. Just remember to allocate twice as much space on the stack, and then use fistp QWORD PTR [esp] instead of fistp DWORD PTR [esp].
I realize that this all looks very complicated, but it really is significantly faster than adjusting the rounding mode, doing the rounding, and setting the rounding mode back. I have benchmarked it extensively on a variety of processors, and in a variety of code paths, and never found it to be slower. But I use it in C code, where the standard requires float-to-integer conversions to truncate, so a compiler targeting the x87 FPU has to switch the rounding mode and restore it around each conversion. If you're writing assembly by hand, and you need truncation, just switch the FPU's rounding mode to "truncate" once and leave it at that.
There is a double-precision version of this bit-twiddling code, too. The key is realizing that the sign bit lies in the upper 32 bits of a 64-bit double, so you still only need a single 32-bit register.
However, the double-precision version is not bug-free! A floating-point value that is extremely close to a whole number will be rounded up to the nearest whole number, instead of being truncated (e.g., 4.99999977 is erroneously rounded to 5, instead of being truncated to 4). Someone smarter than me and with more time to play around with this may come up with a way to fix this, but I'm satisfied with the accuracy of this in most cases, especially given the massive speed improvements.
const uint64_t kDoubleAdjustments[2] = { 0xBFDFFFFF00000000,
0x3FDFFFFF00000000 };
sub esp, 8
fst QWORD PTR [esp]
mov eax, DWORD PTR [esp+4] ; we only need the upper 32 bits
shr eax, 31
fadd QWORD PTR [kDoubleAdjustments + (eax * TYPE kDoubleAdjustments)]
fistp DWORD PTR [esp]
; ... do something with the 32-bit result stored at [esp]
add esp, 8
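The inaccuracy mentioned above can be reproduced with a C model of the same arithmetic. As before this is a sketch under assumptions: round_nearest_even is a hand-written stand-in for FISTP in round-to-nearest-even mode, and truncate_double_by_adjustment is a hypothetical name. The adjustment constant decodes to 0.5 - 2^-22 (about 0.49999976), so any input whose fractional part exceeds about 0.99999976 is pushed past the halfway point.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static const uint64_t kDoubleAdjustments[2] = { 0xBFDFFFFF00000000ULL,
                                                0x3FDFFFFF00000000ULL };

/* Stand-in for FISTP under round-to-nearest-even (assumes v fits in a long). */
static long round_nearest_even(double v)
{
    long n = (long)v;                 /* toward zero */
    if ((double)n > v) n--;           /* now n == floor(v) */
    double frac = v - (double)n;      /* in [0, 1) */
    if (frac > 0.5 || (frac == 0.5 && (n & 1))) n++;
    return n;
}

/* Hypothetical model of the double-precision sequence. */
static long truncate_double_by_adjustment(double x)
{
    uint64_t bits;
    double adj;
    memcpy(&bits, &x, sizeof bits);
    memcpy(&adj, &kDoubleAdjustments[bits >> 63], sizeof adj);
    return round_nearest_even(x + adj);
}

/* Usage: ordinary inputs truncate, but a value very close to 5 does not. */
static void truncate_double_demo(void)
{
    assert(truncate_double_by_adjustment(4.75) == 4);
    assert(truncate_double_by_adjustment(-4.75) == -4);
    assert(truncate_double_by_adjustment(4.99999977) == 5);  /* the inaccuracy */
    assert((long)4.99999977 == 4);                           /* a true truncation */
}
```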
The SSE3 instruction set also introduced the fisttp instruction. It works like the fistp instruction, which can store a floating-point number as a 32-bit integer (popping the stack in the process), except that it always truncates the value, regardless of the current rounding mode.
Here is an example of how to use that:
FLD QWORD PTR [esi] ; load 64 bit floating point number
FISTTP DWORD PTR [edi] ; truncate and store as 32 bit integer
or in AT&T-syntax:
fldl (%esi)
fisttpl (%edi)
If you do not have a processor that supports SSE3, you can reach similar results with the fistp instruction after making sure the rounding mode is set to “truncate.”
sub esp,0x4 ; make space for two copies of the control word
fstcw WORD PTR [esp] ; store the FPU control word
fstcw WORD PTR [esp+0x2] ; store another copy
or WORD PTR [esp],0x0c00 ; set rounding mode to "truncate"
fldcw WORD PTR [esp] ; load updated control word
fld QWORD PTR [esi] ; load floating point number
fistp DWORD PTR [edi] ; truncate and store as 32 bit integer
fldcw WORD PTR [esp+0x2] ; restore control word
add esp,0x4 ; release the stack space
or in AT&T-syntax:
sub $4,%esp
fstcw (%esp)
fstcw 2(%esp)
orw $0x0c00,(%esp)
fldcw (%esp)
fldl (%esi)
fistpl (%edi)
fldcw 2(%esp)
add $4,%esp
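For readers wondering where 0x0c00 comes from: bits 10 and 11 of the x87 control word form the rounding-control field, and the value 11b selects rounding toward zero. The C sketch below only illustrates that bit arithmetic; the macro and function names are invented for this example, and 0x037F is the control word an x87 holds after FINIT.

```c
#include <assert.h>
#include <stdint.h>

#define X87_RC_MASK     0x0C00u  /* rounding-control field, bits 10-11 */
#define X87_RC_TRUNCATE 0x0C00u  /* RC = 11b: round toward zero */

/* Hypothetical helper: the control word the OR instruction would produce. */
static uint16_t with_truncation(uint16_t control_word)
{
    return (uint16_t)(control_word | X87_RC_TRUNCATE);
}

/* Hypothetical helper: extract the two-bit rounding-control field. */
static unsigned rounding_control(uint16_t control_word)
{
    return (control_word & X87_RC_MASK) >> 10;
}

/* Usage: starting from the FINIT default, only the RC field changes. */
static void control_word_demo(void)
{
    uint16_t default_cw = 0x037F;                        /* value after FINIT */
    assert(rounding_control(default_cw) == 0);           /* round to nearest */
    assert(with_truncation(default_cw) == 0x0F7F);
    assert(rounding_control(with_truncation(default_cw)) == 3);
}
```

Because both RC bits are set, a plain OR is enough here; selecting one of the other rounding modes would require clearing the field first.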
If your code is not going to run on an 80286 or older, you might want to use fnstcw instead of fstcw (fstcw is simply fnstcw preceded by a one-byte WAIT prefix) to save one byte per instruction at the expense of the code possibly not working on a real 8087.