Why is double preferred over float? [closed]

In most of the code I see around, double is favoured over float, even when high precision is not needed.

Since there are performance penalties to using double (CPU/GPU/memory/bus/cache/...), what is the reason for this overuse of double?

Example: in computational fluid dynamics, all the software I have worked with uses doubles. In this case high precision is useless (because of the errors due to the approximations in the mathematical model), and there is a huge amount of data to be moved around, which could be cut in half by using floats.

The fact that today's computers are powerful is beside the point, because they are used to solve ever more complex problems.

Pietro asked Apr 02 '14 17:04


2 Answers

Among others:

  • The savings are hardly ever worth it (number-crunching is not typical).
  • Rounding errors accumulate, so it is better to start with more precision than you need (experts may know their computation is precise enough anyway, and some calculations can even be done exactly).
  • Common floating-point operations go through the FPU, which internally often works at double or even higher precision anyway.
  • C and C++ implicitly convert from float to double; the reverse conversion is also implicit but may lose precision, so compilers typically warn about it, and C++ list-initialization rejects it (see the sketch after this list).
  • Variadic functions and calls to functions without a prototype always receive double, not float (the second case exists only in ancient C and is actively discouraged).
  • You may commonly do an operation with more precision than needed, but seldom with less, so libraries generally favour higher precision too.
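
To illustrate the conversion and promotion points above, a minimal C sketch (a hypothetical demo.c; compile with e.g. gcc -Wconversion demo.c to see the narrowing warning):

    /* demo.c: float<->double conversions and default argument promotion */
    #include <stdio.h>

    double takes_double(double d) { return d * 2.0; }

    int main(void)
    {
        float f = 1.5f;

        /* float -> double: implicit, silent and lossless */
        double d = takes_double(f);

        /* double -> float: also implicit in C, but may lose precision;
           -Wconversion flags it, and C++ list-initialization rejects it */
        float g = d;

        /* variadic call: both arguments are promoted to double,
           which is why %f prints float and double values alike */
        printf("%f %f\n", f, g);
        return 0;
    }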

But in the end, YMMV: Measure, test, and decide for yourself and your specific situation.

BTW: There's even more for performance fanatics: Use the IEEE half precision type. Little hardware or compiler support for it exists, but it cuts your bandwidth requirements in half yet again.
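
If you want to experiment with it, here is a hedged sketch; it assumes a compiler that provides the _Float16 type (a GCC/Clang extension, standardized in C23) on a target with hardware support, such as x86 with F16C or AArch64:

    /* half.c: IEEE binary16 has 1 sign, 5 exponent and 10 mantissa bits */
    #include <stdio.h>

    int main(void)
    {
        _Float16 h = (_Float16)0.1f;   /* storage is only 16 bits */

        printf("sizeof(_Float16) = %zu\n", sizeof h);  /* typically 2 */
        printf("h = %f\n", (double)h); /* promote by hand for printing */
        return 0;
    }

Arithmetic on _Float16 may be emulated in software on older hardware, so measure before relying on it for speed.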

Deduplicator answered Oct 18 '22 04:10


double is, in some ways, the "natural" floating point type in the C language, which also influences C++. Consider that:

  • an unadorned, ordinary floating-point constant like 13.9 has type double. To make it a float, we have to add the suffix f or F.
  • default argument promotion in C converts float function arguments* to double: this takes place when no parameter type is declared for the argument, i.e. when the function is variadic (such as printf) or has no prototype in scope (old-style C, not permitted in C++).
  • The %f conversion specifier of printf takes a double argument, not float. There is no dedicated way to print float values; a float argument default-promotes to double and so matches %f (the sketch after this list demonstrates all three points).
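
Here is a small sketch tying the three points together (plain C99, no assumptions beyond the standard library):

    /* constants.c: unadorned constants are double; %f relies on promotion */
    #include <stdio.h>

    int main(void)
    {
        float  f = 13.9f;  /* suffix required to get a float constant */
        double d = 13.9;   /* unadorned constant is already a double  */

        /* both match %f: the float argument default-promotes to double */
        printf("%f %f\n", f, d);

        /* sizeof reveals the constant's type */
        printf("sizeof(13.9) = %zu, sizeof(13.9f) = %zu\n",
               sizeof 13.9, sizeof 13.9f);
        return 0;
    }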

On modern hardware, float and double are usually mapped, respectively, to 32 bit and 64 bit IEEE 754 types. The hardware works with the 64 bit values "natively": the floating-point registers are 64 bits wide, and the operations are built around the more precise type (or internally may be even more precise than that). Since double is mapped to that type, it is the "natural" floating-point type.

The precision of float is poor for any serious numerical work, and the reduced range can also be a problem. The IEEE 32 bit type has only 23 bits of mantissa (24 counting the implicit leading bit; 8 bits are consumed by the exponent field and one by the sign). The float type is useful for saving storage in large arrays of floating-point values, provided that the loss of precision and range isn't a problem in the given application. For example, 32 bit floating-point values are sometimes used in audio for representing samples.
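
To see how quickly float runs out of digits, here is a minimal demonstration in C (the exact drift you observe will depend on the platform and compiler):

    /* drift.c: adding 0.1 ten million times; the true sum is 1000000 */
    #include <stdio.h>
    #include <float.h>

    int main(void)
    {
        float  fsum = 0.0f;
        double dsum = 0.0;

        for (int i = 0; i < 10000000; i++) {
            fsum += 0.1f;
            dsum += 0.1;
        }

        /* float has only FLT_DIG (~6) reliable decimal digits, so the
           float sum drifts visibly while the double sum barely moves */
        printf("float : %.1f (FLT_DIG = %d)\n", (double)fsum, FLT_DIG);
        printf("double: %.1f (DBL_DIG = %d)\n", dsum, DBL_DIG);
        return 0;
    }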

It is true that using a 64 bit type rather than a 32 bit one doubles the raw memory bandwidth. However, that only affects programs which work with large arrays of data accessed in a pattern with poor locality. The superior precision of the 64 bit floating-point type trumps such optimization concerns. Quality of numerical results is more important than shaving cycles off the running time, in accordance with the principle of "get it right first, then make it fast".


* Note, however, that there is no general automatic promotion from float expressions to double; the only promotion of that kind is the integral promotion: char, short and bit-fields going to int.

Kaz answered Oct 18 '22 04:10