I've heard that the x87 FPU works with 80-bit floats, so even if I want to calculate using 64-bit numbers, it would calculate with 80 bits and then convert the result. But which is faster for arithmetic in Swift on x86-64: Double or Float80?
Double and Float are both used to represent decimal numbers, but they do so in slightly different ways. If you initialize a decimal literal in Swift as shown below, the compiler will assume that you meant to create a Double: let val = 3.123. The reason for this is that Double is the more precise type: a Float occupies 32 bits and gives you roughly 7 significant decimal digits, whereas a Double occupies 64 bits and gives you roughly 15-16.
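A minimal sketch of both points, the default inference and the precision difference (the values are just for illustration):

```swift
let val = 3.123              // inferred as Double by default
let f: Float = 3.123         // explicitly a 32-bit Float

print(Double.pi)             // 3.141592653589793  (~15-16 significant digits)
print(Float.pi)              // 3.1415925          (~7 significant digits)
```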
Or are they the same? Float64 is just a type alias for Double. You can confirm this in the headers (hit Command + Shift + O in Xcode and search for Float64), where you'll find public typealias Float64 = Double.
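Because it is a type alias and not a distinct type, values move between the two names with no conversion at all; a quick illustration:

```swift
// Float64 and Double are literally the same type, so no conversion is needed.
let a: Float64 = 1.25
let b: Double = a                       // assigns directly, no cast
print(Float64.self == Double.self)      // true
```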
To round a Double or Float down to the nearest whole number, use the built-in function floor(_:). As with round(_:), you can combine multiplication and division to round down with floor(_:) to a specific decimal place. That's it! By using round(_:), ceil(_:), and floor(_:) you can round Double and Float values to any number of decimal places in Swift, as sketched below.
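A small sketch of that scale-up/round/scale-down trick (the sample value and decimal places are arbitrary):

```swift
import Foundation   // floor(_:), ceil(_:), round(_:)

let value = 3.14159

// Round down to the nearest whole number.
let whole = floor(value)                    // 3.0

// Round down to two decimal places: scale up, floor, scale back down.
let twoPlaces = floor(value * 100) / 100    // 3.14

// The same trick works with ceil(_:) and round(_:).
let roundedUp = ceil(value * 100) / 100     // 3.15
let rounded = round(value * 1000) / 1000    // 3.142
```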
While it's true that the x87 FPU operates internally at 80-bit "extended" precision (at least, by default; this is customizable, and in fact 32-bit builds following the macOS ABI set 64-bit internal precision), binaries targeting x86-64 no longer use x87 FPU instructions. All x86 chips that implement the 64-bit long mode extension also support SSE2 (in fact, this was required by the AMD64 specification), so a 64-bit binary can always assume SSE2 support. As such, SSE2 is what is used to implement floating-point operations, because it's much more efficient and easier for a compiler to optimize.
Even 32-bit builds in the modern era assume SSE2 as a minimum, and certainly on the Macintosh platform, since SSE2 was introduced with the Pentium 4, which predated the Macintosh platform's switch to Intel x86 chips. All x86 chips ever used in an Apple machine support SSE2.
So no, you aren't going to see any performance improvement by using an 80-bit extended precision type. You weren't going to see any performance improvement from x87 instructions, even if they were generated by the compiler. And you certainly aren't going to see any performance improvement on x86-64, because SSE2 supports a maximum of 64-bit precision in hardware. Any 80-bit precision operations are going to have to be implemented in software, or force a smart compiler to emit x87 instructions, which means you don't benefit from any of the nice features and tangible performance improvements of SSE2.
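If you want to see this for yourself, something along the lines of the following rough micro-benchmark sketch will do; the loop body, iteration count, and helper name are arbitrary, and a real measurement should use a proper benchmarking harness and a release (-O) build so the loop isn't optimized away.

```swift
import Foundation

// Naive timing helper; the same generic loop runs for any BinaryFloatingPoint type.
func benchmark<T: BinaryFloatingPoint>(_ label: String, _ seed: T) {
    var acc = seed
    let start = Date()
    for i in 1...10_000_000 {
        acc += T(i) * 1.000_000_1   // one multiply and one add per iteration
    }
    let elapsed = Date().timeIntervalSince(start)
    print("\(label): \(elapsed) s (result: \(acc))")
}

benchmark("Double ", 0.0 as Double)    // compiles to SSE2 scalar instructions on x86-64
benchmark("Float80", 0.0 as Float80)   // forces the compiler to fall back to x87
```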
Double will almost always[1] be at least as fast as Float80 on modern Intel processors, in almost any language. There are some situations in which it will be significantly faster:
- Double uses less memory; it's possible for an algorithm's working set to fit in cache when using Double, but fail to fit when using Float80, causing significant performance hazards.
- Double can take advantage of FMA instructions (exposed in Swift as .add[ing]Product(x,y) and the fma() free function; see the sketch after this list), which effectively doubles the attainable floating-point throughput on recent cores.
- Double can be auto-vectorized by the compiler. There are no vector instructions for Float80. When possible, this can give you up to a 4x speedup.
- Math functions like sin, cos, pow, etc. are faster on Double than they are on Float80.
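As referenced in the FMA point above, here is a minimal sketch of the two Swift spellings of a fused multiply-add (the operand values are arbitrary):

```swift
import Foundation                  // for the fma() free function

var acc = 1.0                      // Double
let x = 2.0, y = 3.0

// Mutating form: acc = acc + (x * y), computed with a single rounding.
acc.addProduct(x, y)               // acc is now 7.0

// Non-mutating form returns the result instead of modifying acc.
let r = acc.addingProduct(x, y)    // r == 13.0

// The C-style free function computes the same thing: fma(x, y, acc).
let s = fma(x, y, acc)             // s == 13.0
```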
There are some other reasons to use Double: it's portable to non-x86 hardware, whereas Float80 is not, and interoperability with C interfaces is easier with Double than it is with Float80. You should only use Float80 when necessary, and default to using Double otherwise.
[1] There are a few niche cases where Float80 can be faster: if an algorithm repeatedly underflows in Double, but remains in normal range in Float80, for example. These are rare, and usually not worth worrying about; more commonly your algorithm will also underflow in Float80, just a few iterations later.
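For a concrete (if contrived) illustration of that footnote, the square of 1e-300 underflows to zero in Double but is still a perfectly normal Float80 value:

```swift
let d: Double = 1e-300
print(d * d)        // 0.0 — 1e-600 underflows; Double bottoms out near 5e-324

let e: Float80 = 1e-300
print(e * e)        // ≈ 1e-600, still a normal Float80 value
```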