Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Julia: BigFloat to byte array

Tags:

arrays

julia

I want to get the binary digits of a BigFloat number. I thought that the easiest way to do that is to turn the number to a byte array. In this question there is a way to do that for Float64, but when I try the same for BigFloat it tells me that write does not have a method for BigFloat.

How can I do that with Julia?

PS. I know that there exists a method to get the binary digits as string, however this would introduce a huge overhead and I want to avoid it.

like image 438
tst Avatar asked Oct 06 '17 11:10

tst


3 Answers

Serializing or representing BigFloats is tricky because: a. the BigFloat structure uses pointers, and b. the pointed to buffer is controlled by the GMP library which is an external library wrapped in Julia.

So as a first step, here are some functions to textually format a BigFloat in an accurate and easy to manipulate way:

# numbits(a) returns the number of significant digits in a BigFloat
# including digits both left and right of floating point.
# For example:
#     julia> numbits(BigFloat(3/8)
#     2
#
# This is the trickiest function since it looks slightly into
# the internal representation in GMP of BigInt data.

function numbits(a::BigFloat)
    n = a.prec
    for i=1:(a.prec>>count_ones(sizeof(Base.GMP.Limb)*8-1))
        tz = trailing_zeros(unsafe_load(a.d,i))
        n -= tz
        if tz < sizeof(Base.GMP.Limb)*8
            break
        end
    end
    return n
end

# mantissarep(a) returns a tuple with two elements. The first element is a BigInt
# holding all the significant bits of a BigFloat, and the second holds
# the exponent needed to return the floating point to its position.
# Thus, as an example, the following holds:
#     julia> a = BigFloat(1.1)
#     1.100000000000000088817841...
#
#     julia> mantissarep(a)
#     (2476979795053773, 51)
# 
#     julia> big"2476979795053773"/big"2"^51 == a
#     true
#

mantissarep(a::BigFloat) = (BigInt(a*BigFloat(2)^(numbits(a)-a.exp)),
                            numbits(a)-a.exp)

# bigfloattext(a) returns an easy textual representation of a BigFloat

function bigfloattext(bf::BigFloat)
    (a,b) = mantissarep(bf)
    return "$(sign(a)==-1 ? "-" : "")big\"$(dec(abs(a)))\"/big\"2\"^$(b)"
end

  

# bigfloatbintext(a) returns an easy binary textual representation a BigFloat

function bigfloatbintext(bf::BigFloat)
    (a,b) = mantissarep(bf)
    return "$(sign(a)==-1 ? "-" : "")big\"0b$(bin(abs(a)))\"/big\"2\"^0b$(bin(b))"
end

With these functions, one method of serializing a BigFloat is simply to write the string output of bigfloattext. This method is not very compact, but it will be easy to deserialize (the parse method for BigFloats could be used) and relatively clear. To make a more compact serialization, the bytes of mantissarep together with the precision field of the BigFloat could be written. This would make this answer a bit longer, so I will add if requested.

For example, writing:

julia> a = sin(BigFloat(π)/3)    # calculate some BigFloat
8.660254037844386467637231707529361834714026269051903140279034897259665084543988e-01

julia> println(bigfloattext(a))  # represent it as a string
big"100278890836790510567389408543623384672710501789331344711007167057270294106993"/big"2"^256

julia> # show representation is faithful
julia> big"100278890836790510567389408543623384672710501789331344711007167057270294106993"/big"2"^256 == a
true

Note that (in regard to other answer) writing the d field (=data field) of a BigFloat which is a Ptr{Limb} type is rather useless and could possibly lead to memory corruption.

UPDATE: Guided by the poster's comment below, here are another couple of functions to convert a BigFloat into a byte representation:

function BigFloat2Bytes(bf::BigFloat)
    bf2 = bf/big"2"^exponent(bf)-1.0
    bvec = Vector{UInt8}()
    push!(bvec,0x01)
    while bf2 != 0.0
        bf2 *= 256
        b = trunc(Int, bf2)
        push!(bvec, UInt8(b))
        bf2 -= b
    end
    return (bvec, exponent(bf))
end

function Bytes2BigFloat(bvec, e)
    bf = zero(BigInt)
    for b in bvec
        bf = bf*256 + b
    end
    return BigFloat(bf*BigFloat(2)^(e-8*length(bvec)+8))
end

With these we have:

julia> a = BigFloat(123.2323)
1.23232299999999995065991242881...e+02

julia> BigFloat2Bytes(a)
(UInt8[0x01, 0xec, 0xed, 0xe0, 0x0d, 0x1b, 0x71, 0x70], 6)

julia> Bytes2BigFloat(BigFloat2Bytes(a)...)
1.23232299999999995065991242881...e+02

julia> Bytes2BigFloat(BigFloat2Bytes(a)...) == a
true
like image 75
Dan Getz Avatar answered Oct 15 '22 23:10

Dan Getz


I was trying to use other approach. I am not sure if I am doing it right! But maybe it is interesting. :)

Smallest overhead we could probably get by using library which is responsible for BigFloat (MPFR). So I was trying to call mpfr_sprintf (http://www.mpfr.org/mpfr-current/mpfr.html#Formatted-Output-Functions) to get hexadecimal and binary representation of BigFloat:

julia> begin 
   a = " "^1000  # this could be not enough!
   f = BigFloat(123.2323)
   ccall(
      (:mpfr_sprintf, :libmpfr), 
      Int64, 
      (Cstring, Cstring, Ref{BigFloat}...),
      a, 
      "%Rb\n%Ra", 
      Ref(f), Ref(f))
   res_bin = a[1:search(a, '\n')-1]
   res_hex = a[search(a, '\n')+1:search(a, '\0')-1]
   println(res_bin)
   println(res_hex)
   println(f == BigFloat(res_hex))
   end
1.1110110011101101111000000000110100011011011100010111p+6
0x7.b3b780346dc5cp+4
true

Did I miss better Julia API (not ccall) to mpfr_sprintf (and other functions) somewhere around BigFloat?

edit: Some explanation for tst.

Sorry I think my answer is not primary for you because it is about conversion to string.

BigFloat is based on MPFR library. It has mpfr_sprintf function which could format output to string. "%Ra" is for example format string for hexadecimal output for mpfr_t type (Ref(BigFloat) is reference of this type). Julia is strong with calling c-libraries. So you could call mpfr_sprintf. I showed both binary and hexadecimal output. Output is null ended string so I needed to parse it from result. Content of res_hex value -> "0x7.b3b780346dc5cp+4" which you could use to REPL or code - it is supported hexadecimal floating point format:

julia> 0x7.b3b780346dc5cp+4
123.2323

julia> typeof(0x7.b3b780346dc5cp+4)
Float64 

p+4 in it means "binary exponent" (BTW. 0x7b == 123 (from BigFloat(123.2323))

And BTW you could rewrite results from Dan's answer to hexadecimal floating point representation ->

# [0x01, 0xec, 0xed, 0xe0, 0x0d, 0x1b, 0x71, 0x70], 6) ->
julia> 0x1.ecede00d1b7170p+6
123.2323

And BTW python has hex function to return hexadecimal representation -> https://pythonhosted.org/bigfloat/reference/index.html#bigfloat.BigFloat.hex

like image 2
Liso Avatar answered Oct 16 '22 00:10

Liso


Dan's (edited) answer looks more robust than mine, but is still a little awkward for serialization.

Just for reference, here's a quick-and-dirty answer that I use to go from BigFloat to Vector{UInt8}

function unsafe_serialize(p::BigFloat)
    vcat(
        reinterpret(UInt8, [p.prec]),
        reinterpret(UInt8, [p.sign]),
        reinterpret(UInt8, [p.exp]),
        reinterpret(UInt8, unsafe_wrap(Array, p.d, (4,)))
    )
end

and back

function unsafe_deserialize(::Type{BigFloat}, p::Array{UInt8})
    Base.MPFR._BigFloat(
        reinterpret(Int64, p[1:8])[1],
        reinterpret(Int32, p[9:12])[1],
        reinterpret(Int64, p[13:20])[1],
        String(p[21:52])
    )
end

The reason these are dirty is because I've hard coded the numbers and types. In the first function, that 4 assumes your machine uses Int64 for the C long type, and that you've stuck with the default precision (or maybe slightly reduced it). In the second, each of those Int* types, and the corresponding ranges, are machine-dependent, and that final range depends on the precision. These could probably be improved by actually using the correct sizes and types, but I'm too lazy.

I know these work with the default precision on my machine, and will likely work for most installations, but your mileage may vary (right into a segfault) if you have some non-standard machine.

like image 1
Mike Avatar answered Oct 15 '22 23:10

Mike