In C++, I have a bigint class that can hold an integer of arbitrary size.
I'd like to convert large float or double numbers to bigint. I have a working method, but it's a bit of a hack. I used the IEEE 754 format to extract the sign, mantissa and exponent bits of the input number.
Here is the code (the sign is ignored here; it isn't important):
float input = 77e12;
bigint result;
// extract sign, exponent and mantissa,
// according to IEEE 754 single precision number format
unsigned int *raw = reinterpret_cast<unsigned int *>(&input);
unsigned int sign = *raw >> 31;
unsigned int exponent = (*raw >> 23) & 0xFF;
unsigned int mantissa = *raw & 0x7FFFFF;
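// the implicit 24th bit is 1 for every normalized number
// (an exponent field of 0 would mean zero or a subnormal, not handled here)
result = mantissa + 0x800000;
// use the binary exponent to shift the result left or right
int shift = 23 + 127 - static_cast<int>(exponent);
if (shift > 0) result >>= shift; else result <<= -shift;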
cout << input << " " << result << endl;
It works, but it's rather ugly, and I don't know how portable it is. Is there a better way to do this? Is there a less ugly, portable way to extract the binary mantissa and exponent from a float or double?
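One way to keep the bit-level approach while avoiding the pointer cast is to copy the bits out with memcpy and to check the format at compile time. The following is only a sketch under stated assumptions: FloatParts, decompose_bits and to_uint64 are hypothetical names, and std::uint64_t stands in for bigint so the example is self-contained.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper type: for a finite float,
// |value| == significand * 2^exponent exactly.
struct FloatParts {
    std::uint32_t significand;  // integer significand, implicit bit included
    int exponent;               // power of two that scales the significand
    bool negative;
};

inline FloatParts decompose_bits(float input) {
    static_assert(std::numeric_limits<float>::is_iec559 &&
                      sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes IEEE 754 binary32");
    std::uint32_t raw;
    std::memcpy(&raw, &input, sizeof raw);  // well-defined, unlike the pointer cast

    const std::uint32_t biased = (raw >> 23) & 0xFFu;
    assert(biased != 0xFFu && "infinity and NaN have no integer value");

    FloatParts p;
    p.negative = (raw >> 31) != 0;
    p.significand = raw & 0x7FFFFFu;
    if (biased == 0) {
        p.exponent = -149;           // zero or subnormal: no implicit bit
    } else {
        p.significand |= 0x800000u;  // restore the implicit leading 1
        p.exponent = static_cast<int>(biased) - 127 - 23;
    }
    return p;
}

// Stand-in for the bigint conversion: truncates |input| toward zero.
inline std::uint64_t to_uint64(float input) {
    const FloatParts p = decompose_bits(input);
    if (p.exponent >= 0) {
        assert(p.exponent <= 40 && "value does not fit in 64 bits");
        return static_cast<std::uint64_t>(p.significand) << p.exponent;
    }
    return p.exponent <= -24 ? 0 : static_cast<std::uint64_t>(p.significand >> -p.exponent);
}
```

For example, to_uint64(3.5f) returns 3. With a real bigint, the final shift would be applied to the bigint rather than to a 64-bit integer, so the range check would not be needed.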
Thanks for the answers. For posterity, here is a solution using frexp. It's less efficient because of the loop, but it works for float and double alike, and it neither uses reinterpret_cast nor depends on any knowledge of the floating-point representation.
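float input = 77e12;
bigint result;
int exponent;
// frexp returns a fraction in [0.5, 1) with input == fraction * 2^exponent
// (the sign is ignored here, as above; input is assumed non-negative)
double fraction = frexp (input, &exponent);
result = 0;
// emit one bit per iteration, most significant first
for (; exponent > 0; --exponent)
{
    fraction *= 2;
    result <<= 1; // make room before adding the bit, so the last bit is not lost
    if (fraction >= 1)
    {
        result += 1;
        fraction -= 1;
    }
}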
Can't you normally extract the values using frexp(), frexpf(), frexpl()?
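As a rough sketch of that suggestion: std::frexp gives the fraction and binary exponent, and std::ldexp can then scale the fraction up to an exact integer significand. The Parts struct and the decompose function are hypothetical names chosen for this illustration, and the sketch assumes the significand fits in 64 bits.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical result type: |value| == significand * 2^exponent exactly.
struct Parts {
    std::uint64_t significand;
    int exponent;
    bool negative;
};

template <typename F>
Parts decompose(F value) {
    static_assert(std::numeric_limits<F>::radix == 2,
                  "assumes binary floating point");
    static_assert(std::numeric_limits<F>::digits <= 64,
                  "significand must fit in 64 bits");
    assert(std::isfinite(value));

    int exp = 0;
    // fraction is 0 or lies in [0.5, 1); |value| == fraction * 2^exp
    const F fraction = std::frexp(std::fabs(value), &exp);

    // Scaling by 2^digits turns the fraction into an exact integer.
    constexpr int digits = std::numeric_limits<F>::digits;
    const F scaled = std::ldexp(fraction, digits);

    return { static_cast<std::uint64_t>(scaled), exp - digits, std::signbit(value) };
}
```

For 1.0f this yields a significand of 2^23 and an exponent of -23. A bigint conversion could then load the significand and shift it left or right by the exponent, without looking at the bit layout at all.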
I like your solution! It got me on the right track.
I'd recommend one thing though - why not get a bunch of bits all at once and almost always eliminate any looping? I implemented a float-to-bigint function like this:
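template<typename F>
explicit inline bigint(F f, typename std::enable_if<(std::is_floating_point<F>::value)>::type* enable = nullptr) {
    int exp;
    // the std:: overloads keep the full precision of F
    F fraction = std::frexp(std::fabs(f), &exp);
    // take the top ulong_bit_count bits of the fraction in one step
    F chunk = std::floor(fraction *= float_pow_2<F,ulong_bit_count>::value);
    *this = ulong(chunk); // will never overflow; frexp() is guaranteed < 1
    exp -= ulong_bit_count;
    while (sizeof(F) > sizeof(ulong) && (fraction -= chunk)) // this is very unlikely
    {
        chunk = std::floor(fraction *= float_pow_2<F,ulong_bit_count>::value);
        *this <<= ulong_bit_count;
        (*this).data[0] = ulong(chunk);
        exp -= ulong_bit_count;
    }
    // exp is negative whenever |f| < 2^ulong_bit_count, so pick the shift direction
    // explicitly (this assumes bigint also provides operator>>=)
    if (exp >= 0) *this <<= exp; else *this >>= -exp;
    sign = f < 0;
}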
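For readers without this bigint class at hand, the idea of pulling many bits per step can be tried with a plain vector of 32-bit limbs. Note that this is a variant, not the constructor above: it peels limbs off the low end by dividing by 2^32, which stays exact because scaling by a power of two does not round. The name to_limbs and the little-endian limb layout are assumptions made for this sketch.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for bigint storage: little-endian base-2^32 limbs
// of |value| truncated toward zero. An empty vector means zero.
template <typename F>
std::vector<std::uint32_t> to_limbs(F value) {
    assert(std::isfinite(value));
    const F base = std::ldexp(F(1), 32);     // 2^32, exactly representable
    F rest = std::trunc(std::fabs(value));   // drop the sign and the fraction

    std::vector<std::uint32_t> limbs;
    while (rest >= 1) {
        // Dividing by a power of two is exact, so no bits are lost here.
        const F high = std::floor(rest / base);
        // rest - high * base is the low 32 bits, an integer below 2^32.
        limbs.push_back(static_cast<std::uint32_t>(rest - high * base));
        rest = high;
    }
    return limbs;
}
```

For the double value 3 * 2^32 + 7 this returns the limbs {7, 3}. Unlike the constructor above, the loop count grows with the exponent (one iteration per 32 bits of magnitude), so it trades some speed for not needing a multi-word shift at the end.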
(By the way, I don't know of an easy way to write floating-point power-of-two constants, so I defined float_pow_2 as follows):
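// Primary template: small exponents, where 1u << Exp cannot overflow.
// (sizeof counts bytes, not bits, so this threshold is conservative but safe.)
template<typename F, unsigned Exp, bool Overflow = (Exp >= sizeof(unsigned))>
struct float_pow_2 {
    static constexpr F value = 1u << Exp;
};
// Larger exponents: square the half power, doubling once more if Exp is odd.
template<typename F, unsigned Exp>
struct float_pow_2<F,Exp,true> {
    static constexpr F half = float_pow_2<F,Exp/2>::value;
    static constexpr F value = half * half * (Exp & 1 ? 2 : 1);
};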
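If C++14 or later is available, a constexpr function with a loop might be a simpler way to get the same constants. At run time, std::ldexp(F(1), n) does the same job. The names pow2 and pow2_runtime below are made up for this sketch.

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>

// Compile-time 2^exp as a floating-point value (assumes C++14 constexpr rules).
template <typename F>
constexpr F pow2(unsigned exp) {
    static_assert(std::is_floating_point<F>::value,
                  "F must be a floating-point type");
    F result = 1;
    for (unsigned i = 0; i < exp; ++i) {
        result *= 2;  // doubling is exact as long as the result stays in range
    }
    return result;
}

// Run-time alternative: std::ldexp scales by a power of two.
template <typename F>
F pow2_runtime(int exp) {
    return std::ldexp(F(1), exp);
}

static_assert(pow2<double>(10) == 1024.0, "2^10 should be 1024");
```

With this, float_pow_2<F,ulong_bit_count>::value could be written as pow2<F>(ulong_bit_count), provided the exponent stays within the range of F.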