We have two embedded projects: one of them uses the Cosmic compiler and the other uses GCC. Both abide by ISO/IEC 9899:1990.
When we initialize a float with the literal 14.8f, it gets translated to the binary representation 0x416CCCCC by the Cosmic compiler and 0x416CCCCD by GCC.
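To see which of the two patterns a given toolchain produces, the literal's raw bits can be dumped. This is only a minimal sketch: the helper name float_bits is made up for illustration, it assumes float is a 32-bit IEEE 754 value, and it uses <stdint.h>, which is C99 (on a strict C90 compiler a 32-bit unsigned type would have to be substituted).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the raw bit pattern of a float.
 * Assumes float is a 32-bit IEEE 754 binary32 value. */
static uint32_t float_bits(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits); /* copy the bytes, no value conversion */
    return bits;
}

/* The literal under discussion; the compiler converts it at translation time. */
static const float threshold = 14.8f;
```

Printing float_bits(threshold) in hexadecimal on each target should show 0x416CCCCD where the compiler rounds to nearest and 0x416CCCCC where it rounds down.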
The ISO/IEC standard, in chapter 6.3.1.4, item 2 (Floating types), states:
If the value being converted is in the range of values that can be represented but cannot be represented exactly, the result is either the nearest higher or nearest lower value, chosen in an implementation-defined manner.
As we are using these numbers as thresholds, this obviously makes a difference.
The Cosmic compiler's documentation states that it uses a round-down implementation.
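To illustrate why the last bit matters for a threshold, here is a small sketch. The names float_from_bits and reaches_threshold are hypothetical, and it again assumes a 32-bit IEEE 754 float and the availability of <stdint.h>.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build a float from a raw 32-bit pattern.
 * Assumes float is a 32-bit IEEE 754 binary32 value. */
static float float_from_bits(uint32_t bits)
{
    float value;
    memcpy(&value, &bits, sizeof value);
    return value;
}

/* Hypothetical check: returns 1 when the sample reaches the threshold. */
static int reaches_threshold(float sample, float threshold)
{
    return sample >= threshold;
}
```

A sample whose bit pattern is 0x416CCCCC reaches a threshold of 0x416CCCCC (the round-down result) but not a threshold of 0x416CCCCD (the round-to-nearest result), so the same source code can behave differently on the two projects.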
As GCC is considerably more complex, I was wondering whether it has a compiler flag that allows choosing the behavior at compile time. So far I have only found that you can choose FE_DOWNWARD, but that relates to run time rather than compile time.
Does anyone know of such a flag for compile-time conversion?
A floating-point literal has an integer part, a decimal point, a fractional part, and an exponent part. You can represent floating point literals either in decimal form or exponential form.
Just for reference's sake, the relevant chapter in GCC's manual states:
How the nearest representable value or the larger or smaller representable value immediately adjacent to the nearest representable value is chosen for certain floating constants (C90 6.1.3.1, C99 and C11 6.4.4.2).
C99 Annex F is followed.
And in my draft C99 standard, Annex F says:
F.7.2 Translation
During translation the IEC 60559 default modes are in effect:
— The rounding direction mode is rounding to nearest.
— The rounding precision mode (if supported) is set so that results are not shortened.
— Trapping or stopping (if supported) is disabled on all floating-point exceptions
So that seems to clearly state that the rounding direction used during translation is round-to-nearest.
Using the hexadecimal floating-constant syntax to get the exact desired float seems like the proper solution here, and (I guess) the reason that syntax exists.
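As a rough sketch of what that could look like for the value in question: the macro names and the helper raw_bits below are made up for illustration, and a 32-bit IEEE 754 float is assumed. Note that hexadecimal floating constants are a C99 feature; as far as I know GCC accepts them as an extension in C90 mode (possibly with a warning under -pedantic), but whether the Cosmic compiler accepts them is something I have not checked.

```c
#include <stdint.h>
#include <string.h>

/* 0xECCCCC * 2^-20: the value the round-down compiler produces for 14.8f
 * (bit pattern 0x416CCCCC). The significand fits in 24 bits, so no
 * rounding takes place when the constant is converted. */
#define THRESHOLD_ROUNDED_DOWN 0xECCCCCp-20f

/* 0xECCCCD * 2^-20: the round-to-nearest result (bit pattern 0x416CCCCD). */
#define THRESHOLD_NEAREST 0xECCCCDp-20f

/* Hypothetical helper: return the raw bit pattern of a float.
 * Assumes float is a 32-bit IEEE 754 binary32 value. */
static uint32_t raw_bits(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

Because each constant is exactly representable as a float, the result no longer depends on the compiler's implementation-defined rounding choice.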