Is there a C++ Standard-compliant way to determine the structure of a 'float', 'double', and 'long double' at compile time (or at run time, as an alternative)?
If I assume std::numeric_limits< T >::is_iec559 == true and std::numeric_limits< T >::radix == 2, I suspect this is possible with expressions vaguely like:
size_t num_significand_bits = std::numeric_limits< T >::digits;
size_t num_exponent_bits = log2( 2 * std::numeric_limits< T >::max_exponent );
size_t num_sign_bits = 1u;
except I know std::numeric_limits< T >::digits includes the "integer bit", whether or not the format actually explicitly represents it, so I don't know how to programmatically detect and adjust for this. std::numeric_limits< T >::max_exponent is always 2^(num_exponent_bits)/2.
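As a rough illustration of the expressions above, here is a minimal compile-time sketch. It assumes is_iec559 is true and radix is 2, and it simply assumes that the leading "integer bit" is implicit (true for the 32-bit and 64-bit IEEE formats, not for the x87 80-bit extended format). The helper names ilog2, exponent_bits and stored_significand_bits are made up for this example and are not part of any library.

```cpp
#include <cstddef>
#include <limits>

// Hypothetical helper: floor(log2(v)) for v >= 1, usable in constant expressions.
constexpr std::size_t ilog2(unsigned long long v) {
    return v <= 1 ? 0 : 1 + ilog2(v >> 1);
}

// Width of the exponent field, using the rule max_exponent == 2^(bits) / 2.
template <typename T>
constexpr std::size_t exponent_bits() {
    static_assert(std::numeric_limits<T>::is_iec559, "sketch assumes IEC 559 / IEEE 754");
    static_assert(std::numeric_limits<T>::radix == 2, "sketch assumes a binary format");
    return ilog2(2ull * std::numeric_limits<T>::max_exponent);
}

// Width of the stored significand field.
// ASSUMPTION: the leading "integer bit" is implicit, so it is subtracted from digits.
// This is wrong for formats that store the bit explicitly (e.g. x87 extended).
template <typename T>
constexpr std::size_t stored_significand_bits() {
    return static_cast<std::size_t>(std::numeric_limits<T>::digits) - 1;
}

// On a typical IEEE 754 platform the fields add up to the full width of the type.
static_assert(1 + exponent_bits<float>() + stored_significand_bits<float>() == 32, "binary32");
static_assert(1 + exponent_bits<double>() + stored_significand_bits<double>() == 64, "binary64");
```

Nothing here detects whether the integer bit is really stored, so the open part of the question remains open; the sketch only shows that the arithmetic can be done without a run-time log2 call.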
Background: I'm trying to overcome two issues portably:
In short, no. If std::numeric_limits<T>::is_iec559, then you
know the format of T, more or less: you still have to
determine the byte order. For anything else, all bets are off.
(The other formats I know that are still being used aren't even
base 2: IBM mainframes use base 16, for example.) The
"standard" arrangement of an IEC floating point value has the
sign on the high order bit, then the exponent, and the mantissa
on the low order bits; if you can successfully view it as a
uint64_t, for example (via memcpy, reinterpret_cast or union;
only memcpy is guaranteed to work, though it may be less
efficient than the other two), then:
for double:
uint64_t tmp;
memcpy( &tmp, &theDouble, sizeof( double ) );
bool isNeg = (tmp & 0x8000000000000000) != 0;
int exp = (int)( (tmp & 0x7FF0000000000000) >> 52 ) - 1022 - 53;
uint64_t mant = (tmp & 0x000FFFFFFFFFFFFF) | 0x0010000000000000;  // 53 bits: too wide for a 32-bit long
for float:
uint32_t tmp;
memcpy( &tmp, &theFloat, sizeof( float ) );
bool isNeg = (tmp & 0x80000000) != 0;
int exp = (int)( (tmp & 0x7F800000) >> 23 ) - 126 - 24;
long mant = (tmp & 0x007FFFFF) | 0x00800000;
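The snippets above can be folded into a small helper. What follows is only a sketch for double, assuming is_iec559 is true, double is 64 bits wide, integers and floating point values share the same byte order, and the input is a normal number (zero, subnormals, infinities and NaNs have no implicit leading 1 and are not handled). The names DecodedDouble, decode and recompose are invented for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical result type: value == (negative ? -1 : 1) * mantissa * 2^exponent.
struct DecodedDouble {
    bool negative;
    int exponent;            // power of two applied to the integer mantissa
    std::uint64_t mantissa;  // 53-bit integer, implicit leading 1 included
};

inline DecodedDouble decode(double value) {
    static_assert(std::numeric_limits<double>::is_iec559, "sketch assumes IEC 559 double");
    static_assert(sizeof(double) == sizeof(std::uint64_t), "sketch assumes a 64-bit double");
    assert(std::isnormal(value));  // ASSUMPTION: normal numbers only

    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);  // well-defined way to read the bit pattern

    DecodedDouble d;
    d.negative = (bits >> 63) != 0;
    // Biased exponent minus bias (1023) minus the 52 fraction bits: 1023 + 52 == 1022 + 53.
    d.exponent = static_cast<int>((bits >> 52) & 0x7FF) - 1075;
    d.mantissa = (bits & 0x000FFFFFFFFFFFFFull) | 0x0010000000000000ull;
    return d;
}

// Rebuild the double from its parts, to check that nothing was lost.
inline double recompose(const DecodedDouble& d) {
    const double magnitude = std::ldexp(static_cast<double>(d.mantissa), d.exponent);
    return d.negative ? -magnitude : magnitude;
}
```

For example, decode(1.0) yields a mantissa of 2^52 and an exponent of -52, and recompose(decode(x)) should give back x for any normal x.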
With regards to long double, it's worse, because different
compilers treat it differently, even on the same machine.
On x86, it's nominally ten bytes, but for alignment reasons, it
may in fact be 12 or 16. Or it may just have the same
representation as double. If it's more than 10 bytes, I think
you can count on it being packed into the first 10 bytes, so
that &myLongDouble gives the address of the 10 byte value. But
generally speaking, I'd avoid long double.
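If you do have to deal with long double, you can at least ask the compiler which flavour it picked. The sketch below is a heuristic, not a guarantee: it classifies long double purely from std::numeric_limits values, using the well-known parameters of the x87 extended format (64 significand bits) and IEEE binary128 (113 significand bits). The enum LongDoubleKind and the function classify_long_double are hypothetical names.

```cpp
#include <limits>

// Hypothetical classification of what the compiler uses for long double.
enum class LongDoubleKind { SameAsDouble, X87Extended, Binary128, Other };

using LongDoubleLimits = std::numeric_limits<long double>;
using DoubleLimits = std::numeric_limits<double>;

constexpr LongDoubleKind classify_long_double() {
    return (LongDoubleLimits::digits == DoubleLimits::digits &&
            LongDoubleLimits::max_exponent == DoubleLimits::max_exponent)
               ? LongDoubleKind::SameAsDouble   // same precision and range as double
           : (LongDoubleLimits::digits == 64 && LongDoubleLimits::max_exponent == 16384)
               ? LongDoubleKind::X87Extended    // 80-bit format, explicit integer bit
           : (LongDoubleLimits::digits == 113 && LongDoubleLimits::max_exponent == 16384)
               ? LongDoubleKind::Binary128      // IEEE quad precision
               : LongDoubleKind::Other;         // e.g. a double-double implementation
}
```

Note that sizeof(long double) alone does not tell you much: an x87 value may occupy 12 or 16 bytes because of padding, which is why the sketch looks at digits and max_exponent instead.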
I would say that the only portable way is to store the number as a string. This does not rely on "interpreting bit patterns".
Even if you know how many bits something is, that doesn't mean it has the same representation: is the exponent zero-based or biased? Is there an invisible 1 at the front of the mantissa? The same applies to all of the other parts of the number. And it gets even worse for BCD-encoded or "hexadecimal" floats, which are available on some architectures.
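To make the string approach concrete, here is a minimal sketch for double. It prints with std::numeric_limits<double>::max_digits10 significant digits, which is enough for the text to convert back to the same value on a conforming implementation, and reads it back with strtod. It assumes the default "C" locale (so the decimal separator is a dot); to_text and from_text are made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <limits>
#include <string>

// Hypothetical helper: format a double with enough digits to round-trip.
inline std::string to_text(double value) {
    char buffer[64];  // far more than max_digits10 digits plus sign, point and exponent
    const int n = std::snprintf(buffer, sizeof buffer, "%.*g",
                                std::numeric_limits<double>::max_digits10, value);
    assert(n > 0 && static_cast<std::size_t>(n) < sizeof buffer);
    return std::string(buffer, static_cast<std::size_t>(n));
}

// Hypothetical helper: parse the text back; no error handling in this sketch.
inline double from_text(const std::string& text) {
    return std::strtod(text.c_str(), nullptr);
}
```

With this, from_text(to_text(x)) should compare equal to x for finite values, without the code ever looking at sign, exponent or mantissa bits.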
If you are worried about uninitialized bits in a structure (class, array, etc), then use memset to set the entire structure to zero [or some other known value].
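A small sketch of the memset idea, assuming a trivially copyable struct with padding between its members. Sample, init_sample and same_bytes are hypothetical names. As far as I know the standard makes no firm promise that padding keeps its value after later member writes or copies, so treat this as something that works in practice rather than a guarantee.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical struct: most ABIs insert padding bytes between tag and value.
struct Sample {
    char tag;
    double value;
};

// Zero every byte (padding included) before filling in the members.
inline void init_sample(Sample* out, char tag, double value) {
    assert(out != nullptr);
    std::memset(out, 0, sizeof *out);
    out->tag = tag;
    out->value = value;
}

// Byte-wise comparison, which is only meaningful once padding has a known value.
inline bool same_bytes(const Sample& a, const Sample& b) {
    return std::memcmp(&a, &b, sizeof(Sample)) == 0;
}
```

Two objects initialised this way with the same member values should then compare equal byte for byte on common compilers.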