
Convert float to bigint (aka portable way to get binary exponent & mantissa)

In C++, I have a bigint class that can hold an integer of arbitrary size.

I'd like to convert large float or double numbers to bigint. I have a working method, but it's a bit of a hack. I used the IEEE 754 specification to get the binary sign, mantissa and exponent of the input number.

Here is the code (the sign is ignored here; it's not important):

 float input = 77e12;
 bigint result;

 // extract sign, exponent and mantissa, 
 // according to IEEE 754 single precision number format
 unsigned int *raw = reinterpret_cast<unsigned int *>(&input); 
 unsigned int sign = *raw >> 31;
 unsigned int exponent = (*raw >> 23) & 0xFF;
 unsigned int mantissa = *raw & 0x7FFFFF;

 // the implicit 24th bit is 1 for normalized numbers (not for zero or denormals)
 result = mantissa + 0x800000;

 // use the binary exponent to shift the result left or right
 int shift = 23 + 127 - static_cast<int>(exponent); // computed in int so the result can go negative
 if (shift > 0) result >>= shift; else result <<= -shift;

 cout << input << " " << result << endl;

It works, but it's rather ugly, and I don't know how portable it is. Is there a better way to do this? Is there a less ugly, portable way to extract the binary mantissa and exponent from a float or double?
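
On the portability concern: reading a float through an unsigned int pointer breaks the C++ aliasing rules, and the bit layout only holds where float is an IEEE 754 single-precision value. A minimal sketch of a stricter variant of the same bit extraction, assuming a 32-bit IEEE 754 float (the names `FloatParts` and `decode_float` are made up for illustration and are not part of the code above):

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper type holding the three raw fields of a binary32 value.
struct FloatParts {
    std::uint32_t sign;      // 1 bit
    std::uint32_t exponent;  // biased exponent, 8 bits
    std::uint32_t mantissa;  // stored fraction bits, 23 bits
};

// Refuse to compile on platforms where the layout assumption does not hold.
static_assert(std::numeric_limits<float>::is_iec559 &&
              sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes IEEE 754 binary32 floats");

inline FloatParts decode_float(float input) {
    std::uint32_t raw;
    std::memcpy(&raw, &input, sizeof raw);  // copy the bytes instead of casting the pointer
    return FloatParts{ raw >> 31, (raw >> 23) & 0xFFu, raw & 0x7FFFFFu };
}
```

This still depends on the representation, so it only makes the hack safer; it does not remove the IEEE 754 dependency.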


Thanks for the answers. For posterity, here is a solution using frexp. It's less efficient because of the loop, but it works for float and double alike, doesn't use reinterpret_cast or depend on any knowledge of floating point number representations.

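float input = 77e12;
bigint result;

int exponent;
// frexp (from <cmath>): input == fraction * 2^exponent, with 0.5 <= fraction < 1
double fraction = frexp (input, &exponent);
result = 0;
// peel off one bit per iteration, most significant first
for (; exponent > 0; --exponent)
{
    fraction *= 2;
    result <<= 1;   // make room first, so the last bit is not shifted away
    if (fraction >= 1)
    {
        result += 1;
        fraction -= 1;
    }
}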
amarillion asked Feb 28 '23 17:02



2 Answers

Can't you normally extract the values using frexp(), frexpf(), frexpl()?
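
For reference, std::frexp splits a finite value x into a fraction f with 0.5 <= |f| < 1 (or f == 0) and an integer exponent e such that x == f * 2^e, and std::ldexp is its inverse. A small sketch of that relationship (the helper name `round_trips` is made up for illustration):

```cpp
#include <cmath>

// Returns true if splitting x with frexp and recombining with ldexp
// reproduces x exactly, which should hold for any finite x.
inline bool round_trips(double x) {
    int e = 0;
    double f = std::frexp(x, &e);  // x == f * 2^e
    return std::ldexp(f, e) == x;
}
```

Both functions have float and long double overloads in `<cmath>`, so no knowledge of the bit layout is needed.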

Kornel Kisielewicz answered Apr 08 '23 04:04



I like your solution! It got me on the right track.

I'd recommend one thing though - why not get a bunch of bits all at once and almost always eliminate any looping? I implemented a float-to-bigint function like this:

template<typename F>
explicit inline bigint(F f, typename std::enable_if<(std::is_floating_point<F>::value)>::type* enable = nullptr) {
    int exp;
    F fraction = frexp(fabs(f),&exp);
    F chunk = floor(fraction *= float_pow_2<F,ulong_bit_count>::value);
    *this = ulong(chunk); // will never overflow; frexp() is guaranteed < 1
    exp -= ulong_bit_count;
    while (sizeof(F) > sizeof(ulong) && (fraction -= chunk)) // this is very unlikely
    {
        chunk = floor(fraction *= float_pow_2<F,ulong_bit_count>::value);
        *this <<= ulong_bit_count;
        (*this).data[0] = ulong(chunk);
        exp -= ulong_bit_count;
    }
    *this <<= exp; // exp can be negative here, so operator<<= must treat that as a right shift
    sign = f < 0;
}
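
The "grab many bits at once" idea can also be sketched without a bigint class. Assuming the mantissa fits in 64 bits (true for float and double, not for 113-bit quad precision), scaling the frexp fraction by 2^digits yields the whole mantissa as an integer in one step. `Parts` and `split` are illustrative names, and the function is only meant for finite inputs:

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

struct Parts {
    std::uint64_t mantissa;  // integer mantissa
    int shift;               // |x| == mantissa * 2^shift
    bool negative;
};

template <typename F>
Parts split(F x) {
    static_assert(std::numeric_limits<F>::radix == 2 &&
                  std::numeric_limits<F>::digits <= 64,
                  "this sketch assumes a binary mantissa of at most 64 bits");
    constexpr int digits = std::numeric_limits<F>::digits;
    int exp = 0;
    F fraction = std::frexp(std::fabs(x), &exp);  // 0.5 <= fraction < 1, or 0
    F scaled = std::ldexp(fraction, digits);      // exact: an integer below 2^digits
    return Parts{ static_cast<std::uint64_t>(scaled), exp - digits, std::signbit(x) };
}
```

A bigint would then be built from `mantissa` and shifted left by `shift` when it is positive, or right by `-shift` when it is negative, which discards any fractional bits.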

By the way, I don't know of an easy way to write floating-point power-of-two constants, so I defined float_pow_2 as follows:

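// the threshold is the bit width of unsigned (needs <limits>), not its size in bytes
template<typename F, unsigned Exp, bool Overflow = (Exp >= std::numeric_limits<unsigned>::digits)>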
struct float_pow_2 {
    static constexpr F value = 1u << Exp;
};
template<typename F, unsigned Exp>
struct float_pow_2<F,Exp,true> {
    static constexpr F half = float_pow_2<F,Exp/2>::value;
    static constexpr F value = half * half * (Exp & 1 ? 2 : 1);
};
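
If a compile-time constant is not required, std::ldexp can produce the same powers of two at run time (it only became constexpr in C++23), so this is an alternative rather than a drop-in replacement for the template above. A sketch with the hypothetical helper name `pow2`:

```cpp
#include <cmath>

// 2^exp as a floating-point value of type F, computed at run time.
template <typename F>
F pow2(int exp) {
    return std::ldexp(F(1), exp);
}
```

For example, `pow2<double>(64)` would stand in for `float_pow_2<double,64>::value` wherever a constant expression is not needed.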
Chris answered Apr 08 '23 06:04
