C++17 adds hexadecimal floating constants (hexadecimal floating-point literals). Why? Could someone show a couple of examples that demonstrate the benefits?

Why hexadecimal floating constants in C++17?

2 Answers

Floating-point numbers are stored on x86/x64 processors in base 2, not base 10: https://en.wikipedia.org/wiki/Double-precision_floating-point_format . Because of that, many decimal fractions cannot be represented exactly. For example, decimal 0.1 is stored as something like 0.1000000000000003 or 0.0999999999999997, whichever value with a finite base-2 representation lies closest to it. Because of that inexactness, printing a floating-point number in decimal and then parsing the text may yield a slightly different number from the one that was stored in binary in memory before printing (for instance, when too few decimal digits are printed).

For some applications such errors are unacceptable: they need to parse back exactly the same binary floating-point number that existed before printing (e.g. one application exports floating-point data and another imports it). For that, one can export and import doubles in hexadecimal format. Because 16 is a power of 2, every binary floating-point number can be represented exactly in hexadecimal format.
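The round trip described above can be sketched as follows. This is a minimal illustration, not code from the answer: the helper names to_decimal, to_hex and parse are made up for the example, and it assumes an IEEE 754 double and a C library whose strtod accepts hexadecimal input (standard since C99).

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper: format a double as decimal text with the given
// number of significant digits.
std::string to_decimal(double value, int significant_digits) {
    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "%.*g", significant_digits, value);
    return buffer;
}

// Hypothetical helper: format a double as hexadecimal floating-point text.
// 13 hex digits after the point cover the 52 stored fraction bits.
std::string to_hex(double value) {
    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "%.13a", value);
    return buffer;
}

// Hypothetical helper: parse text back into a double.
// strtod accepts both decimal and hexadecimal floating-point input.
double parse(const std::string& text) {
    const double value = std::strtod(text.c_str(), nullptr);
    return value;
}

// Returns true when the hex round trip reproduces the value exactly.
bool hex_round_trip_is_exact(double value) {
    return parse(to_hex(value)) == value;
}
```

For x = 0.1 + 0.2, to_decimal(x, 15) gives "0.3", which parses to a different double than x, whereas parse(to_hex(x)) returns x itself. Printing 17 significant decimal digits also round-trips a double when both conversions are correctly rounded; the hex form gets there without relying on that.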

printf and scanf have been extended with the %a format specifier, which prints and parses hexadecimal floating-point numbers. However, MSVC++ does not support the %a format specifier for scanf yet:

The a and A specifiers (see printf Type Field Characters) are not available with scanf.

To print a double at full precision in hexadecimal format, request 13 hexadecimal digits after the point, which correspond to 13*4=52 bits, the size of the stored fraction of a double:

double x = 0.1;
printf("%.13a", x);

See more details on hexadecimal floating point with code and examples (note that, at least in MSVC++ 2013, a plain %a in printf prints 6 hexadecimal digits after the point, not 13; this is stated at the end of the article).
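The 13*4=52 arithmetic can be derived from std::numeric_limits instead of being hard-coded. The sketch below assumes IEEE 754 float and double; the names hex_fraction_digits, format_hex and scan_hex are invented for this illustration. It passes the precision explicitly, so it does not depend on what a plain %a prints, and it reads the text back with sscanf and %la (which, per the quote above, may not be available in MSVC++).

```cpp
#include <cassert>
#include <cstdio>
#include <limits>
#include <string>

// Number of hex digits after the point needed to cover every stored
// fraction bit of T. numeric_limits<T>::digits counts the implicit
// leading bit, so the stored fraction has digits - 1 bits.
template <typename T>
constexpr int hex_fraction_digits() {
    return (std::numeric_limits<T>::digits - 1 + 3) / 4;  // ceil((digits - 1) / 4)
}

// These hold for IEEE 754 binary64 and binary32 (assumed here).
static_assert(hex_fraction_digits<double>() == 13, "52 fraction bits need 13 hex digits");
static_assert(hex_fraction_digits<float>() == 6, "23 fraction bits need 6 hex digits");

// Hypothetical helper: print a double with an explicit full precision.
std::string format_hex(double value) {
    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "%.*a", hex_fraction_digits<double>(), value);
    return buffer;
}

// Hypothetical helper: read a hexadecimal floating-point number with scanf's %la.
double scan_hex(const std::string& text) {
    double value = 0.0;
    const int matched = std::sscanf(text.c_str(), "%la", &value);
    assert(matched == 1);  // the text must contain one parsable number
    (void)matched;         // avoid an unused-variable warning under NDEBUG
    return value;
}
```

With these helpers, scan_hex(format_hex(x)) should give back x bit for bit on a conforming C library. Where scanf lacks %a, strtod is a possible substitute.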

Specifically for constants, as asked in the question: hexadecimal constants are convenient for testing an application on exact hard-coded floating-point inputs. E.g. your bug may be reproducible for 0.1000000000000003 but not for 0.0999999999999997, so you need a hard-coded hexadecimal value to pin down the exact representation of interest near decimal 0.1.
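As a small illustration of such hard-coded inputs, the C++17 literals below name the double nearest to decimal 0.1 and its two immediate neighbours. The constant names and the check function are made up for this sketch, and the bit patterns assume an IEEE 754 double.

```cpp
#include <cassert>
#include <cmath>

// Hexadecimal floating literals (C++17): hex significand, then 'p' and a
// decimal power-of-two exponent. Assumes double is IEEE 754 binary64.
constexpr double kTenth      = 0x1.999999999999ap-4;  // the double nearest to decimal 0.1
constexpr double kTenthBelow = 0x1.9999999999999p-4;  // next representable double below it
constexpr double kTenthAbove = 0x1.999999999999bp-4;  // next representable double above it

static_assert(kTenth == 0.1, "the literal 0.1 rounds to this exact bit pattern");
static_assert(kTenthBelow < kTenth && kTenth < kTenthAbove, "three distinct doubles");

// Confirms that the three constants are adjacent doubles, i.e. there is
// no other representable value between them.
bool tenth_neighbours_are_adjacent() {
    return std::nextafter(kTenth, 0.0) == kTenthBelow &&
           std::nextafter(kTenth, 1.0) == kTenthAbove;
}
```

A test can then feed kTenthBelow, kTenth and kTenthAbove to the code under test one at a time, with no doubt about which double each literal denotes.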

answered Oct 19 '22 20:10

Serge Rogatch

The two main reasons to use hex floats over decimals are accuracy and speed.

The algorithms for accurately converting between decimal constants and the underlying binary format of floating point numbers are surprisingly complicated, and even nowadays conversion errors still occasionally arise.

Converting between hexadecimal and binary is a much simpler endeavour, and is guaranteed to be exact. An example use case is when it is critical that you use one specific floating-point number and not one of its neighbours on either side (e.g. in implementations of special functions such as exp). This simplicity also makes the conversion much faster, since it doesn't require any intermediate "bignum" arithmetic: in some cases I've seen a 3x speed-up for read/write operations with hex floats versus decimals.
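To make the special-function use case concrete, here is a hedged sketch of the kind of constants such code relies on: ln 2 split into a high part with trailing zero bits and a low correction term. The values follow the split commonly seen in fdlibm-derived math libraries, to the best of my knowledge. The names kLn2Hi, kLn2Lo and reduce_argument are illustrative, and an IEEE 754 double is assumed.

```cpp
#include <cassert>
#include <cmath>

// High part of ln 2: only the upper bits of the significand are set, so
// k * kLn2Hi is exact for integers k of moderate size.
constexpr double kLn2Hi = 0x1.62e42feep-1;
// Low part: the remainder ln 2 - kLn2Hi, rounded to double.
constexpr double kLn2Lo = 0x1.a39ef35793c76p-33;

// Illustrative argument reduction for an exp-like function:
// computes x - k * ln 2 in two steps to limit rounding error.
double reduce_argument(double x, int k) {
    return (x - k * kLn2Hi) - k * kLn2Lo;
}

// Checks that multiplying the high part by k loses no bits: fma returns
// the exact product minus the rounded product, which is zero when exact.
bool high_product_is_exact(int k) {
    const double kd = static_cast<double>(k);
    return std::fma(kd, kLn2Hi, -(kd * kLn2Hi)) == 0.0;
}
```

Written in hex, each constant is visibly the intended bit pattern, including the run of zero bits in the high part. A decimal spelling would need about 21 significant digits to look equally unambiguous and would still depend on the compiler's decimal conversion.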

answered Oct 19 '22 19:10

Simon Byrne


Tags: c++, floating-point, constants, c++17

CW Holeman II
