Lua - packing IEEE754 single-precision floating-point numbers

Tags: floating-point, ieee-754, lua, pack

I want to make a function in pure Lua that generates a fraction (23 bits), an exponent (8 bits), and a sign (1 bit) from a number, so that the number is approximately equal to math.ldexp(fraction, exponent - 127) * (sign == 1 and -1 or 1), and then packs the generated values into 32 bits.

A certain function in the math library caught my attention:

The frexp function breaks down the floating-point value (v) into a mantissa (m) and an exponent (n), such that the absolute value of m is greater than or equal to 0.5 and less than 1.0, and v = m * 2^n.

Note that math.ldexp is the inverse operation.
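As a quick illustration of that description, here is a minimal sketch of the frexp/ldexp round trip. It assumes Lua 5.1/5.2 or LuaJIT, where math.frexp and math.ldexp are available; the two fallback functions are my own assumption for interpreters that lack them, and they only cover finite inputs.

```lua
-- Fallbacks are illustrative assumptions, used only when the math library
-- does not provide frexp/ldexp; they handle finite inputs only.
local frexp = math.frexp or function(v)
    if v == 0 then return 0, 0 end
    local m, n = math.abs(v), 0
    while m >= 1 do m, n = m / 2, n + 1 end
    while m < 0.5 do m, n = m * 2, n - 1 end
    return (v < 0 and -m or m), n
end
local ldexp = math.ldexp or function(m, n) return m * 2 ^ n end

-- 10 = 0.625 * 2^4, so m is 0.625 and n is 4
local m, n = frexp(10)
print(m, n)

-- ldexp reverses the split and gives back the original value
local restored = ldexp(m, n)
print(restored == 10)
```

The mantissa 0.625 is a fraction, which is exactly what makes it awkward to pack directly.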

However, I can't think of any way to pack non-integer numbers properly. As the mantissa returned by this function is not an integer, I'm not sure if I can use it.

Is there any efficient way to do something similar to math.frexp() which returns an integer as the mantissa? Or is there perhaps a better way to pack numbers in the IEEE754 single-precision floating-point format in Lua?

Thank you in advance.

Edit

I hereby present the (hopefully) final version of the functions I made:

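function PackIEEE754(number)
    if number == 0 then
        return string.char(0x00, 0x00, 0x00, 0x00)
    elseif number ~= number then
        -- NaN: exponent bits all set, non-zero fraction
        return string.char(0xFF, 0xFF, 0xFF, 0xFF)
    else
        local sign = 0x00
        if number < 0 then
            sign = 0x80
            number = -number
        end
        if number == math.huge then
            -- frexp gives no usable exponent for infinity, so encode it directly
            return string.char(sign + 0x7F, 0x80, 0x00, 0x00)
        end
        -- frexp yields 0.5 <= mantissa < 1 with number = mantissa * 2^exponent
        local mantissa, exponent = math.frexp(number)
        exponent = exponent + 0x7F
        if exponent <= 0 then
            -- subnormal: no implicit bit, fixed scale of 2^-126
            mantissa = math.ldexp(mantissa, exponent - 1)
            exponent = 0
        elseif exponent > 0xFF then
            -- 2^128 or larger: too big for single precision, so infinity
            return string.char(sign + 0x7F, 0x80, 0x00, 0x00)
        elseif exponent == 1 then
            -- largest subnormals: the mantissa is already scaled correctly
            exponent = 0
        else
            -- normal: rescale to 1 <= m < 2 and drop the implicit leading 1
            mantissa = mantissa * 2 - 1
            exponent = exponent - 1
        end
        mantissa = math.floor(math.ldexp(mantissa, 23) + 0.5)
        if mantissa == 0x800000 then
            -- rounding overflowed the 23-bit fraction: carry into the exponent
            mantissa = 0
            exponent = exponent + 1
            if exponent == 0xFF then
                return string.char(sign + 0x7F, 0x80, 0x00, 0x00)
            end
        end
        -- big-endian layout: 1 sign bit, 8 exponent bits, 23 fraction bits
        return string.char(
                sign + math.floor(exponent / 2),
                (exponent % 2) * 0x80 + math.floor(mantissa / 0x10000),
                math.floor(mantissa / 0x100) % 0x100,
                mantissa % 0x100)
    end
end
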
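function UnpackIEEE754(packed)
    -- expects 4 bytes in big-endian order, as produced by PackIEEE754
    local b1, b2, b3, b4 = string.byte(packed, 1, 4)
    -- 8 exponent bits: low 7 bits of byte 1 plus the top bit of byte 2
    local exponent = (b1 % 0x80) * 0x02 + math.floor(b2 / 0x80)
    -- 23 fraction bits, scaled into the range [0, 1)
    local mantissa = math.ldexp(((b2 % 0x80) * 0x100 + b3) * 0x100 + b4, -23)
    if exponent == 0xFF then
        if mantissa > 0 then
            -- NaN
            return 0 / 0
        else
            -- infinity; an exponent of 0x7F makes the final ldexp a no-op
            mantissa = math.huge
            exponent = 0x7F
        end
    elseif exponent > 0 then
        -- normal: restore the implicit leading 1
        mantissa = mantissa + 1
    else
        -- zero or subnormal: no implicit bit, scale is 2^-126
        exponent = exponent + 1
    end
    if b1 >= 0x80 then
        mantissa = -mantissa
    end
    return math.ldexp(mantissa, exponent - 0x7F)
end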

I improved the way to utilise the implicit bit and added proper support for special values such as NaN and infinity. I based the formatting on that of the script catwell linked to.

I thank both of you for your great advice.

asked Jan 19 '13 17:01

RPFeltz

1 Answer

Multiply the significand you get from math.frexp() by 2^24, and subtract 24 from the exponent to compensate. Now the significand is an integer. Note that the significand is 24 bits, not 23 (you need to account for the implicit bit in the IEEE-754 encoding).
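A minimal sketch of that idea, assuming Lua 5.1/5.2 or LuaJIT for math.frexp (the fallback is an assumption for interpreters without it). The helper name integer_frexp is hypothetical, and the sketch only covers finite, non-zero values in the normal single-precision range, so subnormals, infinity and NaN would need separate handling.

```lua
-- Fallback is an illustrative assumption, used only when math.frexp is
-- missing; it handles finite inputs only.
local frexp = math.frexp or function(v)
    if v == 0 then return 0, 0 end
    local m, e = math.abs(v), 0
    while m >= 1 do m, e = m / 2, e + 1 end
    while m < 0.5 do m, e = m * 2, e - 1 end
    return (v < 0 and -m or m), e
end

-- Hypothetical helper: returns a 24-bit integer significand and the
-- exponent adjusted so that |v| is close to significand * 2^exponent.
local function integer_frexp(v)
    local m, e = frexp(math.abs(v))
    -- scale 0.5 <= m < 1 up to 2^23 <= significand <= 2^24, rounding to nearest
    local significand = math.floor(m * 2 ^ 24 + 0.5)
    if significand == 2 ^ 24 then
        -- rounding carried out of 24 bits: renormalise
        significand, e = 2 ^ 23, e + 1
    end
    return significand, e - 24
end

local significand, exponent = integer_frexp(0.15625)
-- dropping the implicit leading bit leaves the 23-bit fraction field
local fraction = significand - 2 ^ 23
-- the stored exponent is biased by 127 and counts from the leading bit
local biased = exponent + 23 + 127
print(significand, exponent, fraction, biased)
```

For 0.15625 this gives a fraction of 0x200000 and a biased exponent of 124, which matches the single-precision encoding 0x3E200000.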

answered Oct 08 '22 02:10

Stephen Canon
