Converting a floating point to its corresponding bit-segments

Question

Given a Ruby Float value, e.g.,

f = 12.125

I'd like to wind up with a 3-element array containing the floating-point number's sign (1 bit), exponent (11 bits), and fraction (52 bits). (Ruby's floats are the IEEE 754 double-precision 64-bit representation.)

What's the best way to do that? Bit-level manipulation doesn't seem to be Ruby's strong point.

Note that I want the bits, not the numerical values they correspond to. For instance, getting [0, -127, 1] for the floating-point value of 1.0 is not what I'm after -- I want the actual bits in string form or an equivalent representation, like ["0", "0ff", "000 0000 0000"].

Matt · Accepted Answer

Float doesn't provide any methods for getting at its bits directly, but the bit data can be exposed via Array#pack.

str = [12.125].pack('G').bytes.map { |n| "%08b" % n }.join
=> "0100000000101000010000000000000000000000000000000000000000000000"

[ str[0], str[1..11], str[12..63] ]
=> ["0", "10000000010", "1000010000000000000000000000000000000000000000000000"]

Going via a string representation is a bit of a roundabout way to pull the fields out. I'm sure there is a more efficient way to pull the data from the original bytes...
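One possibly more direct route, as a minimal sketch: `pack('G')` always emits the double in big-endian byte order, and the `B` unpack directive returns a bit string (most significant bit first), so no per-byte formatting is needed. The helper name `float_bit_strings` is made up for illustration, and `unpack1` assumes Ruby 2.4 or later.

```ruby
# Hypothetical helper: returns [sign, exponent, fraction] as bit strings.
def float_bit_strings(f)
  # 'G' = big-endian double, 'B64' = 64 bits as a string, MSB first
  bits = [f].pack('G').unpack1('B64')
  # 1 sign bit, 11 exponent bits, 52 fraction bits
  [bits[0], bits[1, 11], bits[12, 52]]
end

p float_bit_strings(12.125)
```

The slices use `str[start, length]`, which is equivalent to the ranges `1..11` and `12..63` used above.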

Edit: The bit-level manipulation piqued my interest, so I had a poke around. To use the bitwise operators in Ruby you need an Integer, so the float requires some more unpacking to convert it into a 64-bit int. The big-endian layout, which is how IEEE 754 is usually documented, is fairly trivial. The little-endian representation I'm not so sure about. It's a little odd, as an 11-bit exponent and a 52-bit mantissa don't fall on whole byte boundaries. It becomes fiddly to pull the bits out and swap them about to get something resembling little endian, and I'm not sure it's right, as I haven't seen any reference for the layout. The 64-bit value as a whole is little endian, but I'm not too sure how that applies to the components of the 64-bit value until you store them somewhere else, such as a separate 16-bit int.

As an example of the kind of thing I was doing, for an 11-bit value going from little to big: shift the low byte left by 3 to move it to the front, then OR it with the top 3 bits moved down to the bottom.

v = 0x4F2
((v & 0xFF) << 3) | (v >> 8)
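To make the arithmetic concrete, here is the same swap worked through step by step. This is only an illustration of the expression above; the variable names are made up.

```ruby
v = 0x4F2                      # 11-bit value: 100 11110010

low_byte = v & 0xFF            # 0xF2, the low 8 bits
top_bits = v >> 8              # 0x4, the remaining top 3 bits

# Move the low byte to the front and drop the top bits in behind it
swapped = (low_byte << 3) | top_bits

puts format('%011b', v)        # 10011110010
puts format('%011b', swapped)  # 11110010100
```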

Here it is anyway, hopefully it's of some use.

class Float
  Float::LITTLE_ENDIAN = [1.0].pack("E") == [1.0].pack("D")

  # Returns a sign, exponent and mantissa as integers
  def ieee745_binary64
    # Build a big end int representation so we can use bit operations
    tb = [self].pack('D').unpack('Q>').first

    # Check what we are
    if Float::LITTLE_ENDIAN
      ieee745_binary64_little_endian tb
    else
      ieee745_binary64_big_endian tb
    end
  end

  # Force a little end calc
  def ieee745_binary64_little
    ieee745_binary64_little_endian [self].pack('E').unpack('Q>').first
  end

  # Force a big end calc
  def ieee745_binary64_big
    ieee745_binary64_big_endian [self].pack('G').unpack('Q>').first
  end

  # Little: expects the little-endian bytes read as a big-endian int,
  # so the sign/exponent byte ends up in the lowest 8 bits
  def ieee745_binary64_little_endian big_end_int
    #puts "big #{big_end_int.to_s(2)}"
    sign     = ( big_end_int & 0x80   ) >> 7   # top bit of the lowest byte

    exp_a    = ( big_end_int & 0x7F   ) << 1   # low 7 bits of the lowest byte, moved up one place
    exp_b    = ( big_end_int & 0x8000 ) >> 15  # bit 15, moved down to bit 0 to fill the sign gap
    exp_c    = ( big_end_int & 0x7000 ) >> 4   # bits 12-14, moved to bits 8-10 at the front
    exponent = exp_a | exp_b | exp_c

    mant_a   = ( big_end_int & 0xFFFFFFFFFFFF0000 ) >> 12 # upper 48 bits; 0xF000 was taken above
    mant_b   = ( big_end_int & 0x0000000000000F00 ) >> 8  # 0xF00 was left over
    mantissa = mant_a | mant_b

    [ sign, exponent, mantissa ]
  end

  # Big
  def ieee745_binary64_big_endian big_end_int
    sign     = ( big_end_int & 0x8000000000000000 ) >> 63
    exponent = ( big_end_int & 0x7FF0000000000000 ) >> 52
    mantissa = ( big_end_int & 0x000FFFFFFFFFFFFF ) >> 0

    [ sign, exponent, mantissa ]
  end
end

and testing...

def printer val, vals
  printf "%-15s   sign|%01b|\n",            val,     vals[0]
  printf "  hex e|%3x|         m|%013x|\n", vals[1], vals[2]
  printf "  bin e|%011b| m|%052b|\n\n",     vals[1], vals[2]
end

floats = [ 12.125, -12.125, 1.0/3, -1.0/3, 1.0, -1.0, 1.131313131313, -1.131313131313 ]

floats.each do |v|
  printer v, v.ieee745_binary64
  printer v, v.ieee745_binary64_big
end

TIL my brain is big endian! You'll note the ints being worked with are both big endian. I failed at bit shifting the other way.
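For what it's worth, the sign, exponent and fraction fields of IEEE 754 binary64 are defined on the 64-bit value itself rather than on its byte order in memory. So a sketch that always packs with 'G' and reads the bytes back with 'Q>' should give the same integers on any host, with no endianness check. `binary64_fields` is a hypothetical standalone helper, not part of the class above, and `unpack1` assumes Ruby 2.4 or later.

```ruby
# Hypothetical helper: split a Float into its binary64 fields as integers.
def binary64_fields(f)
  # 'G' packs a big-endian double; 'Q>' reads the same 8 bytes as a
  # big-endian unsigned 64-bit integer, whatever the host byte order.
  bits = [f].pack('G').unpack1('Q>')

  sign     = bits >> 63                      # top bit
  exponent = (bits >> 52) & 0x7FF            # next 11 bits (biased)
  fraction = bits & 0x000F_FFFF_FFFF_FFFF    # low 52 bits

  [sign, exponent, fraction]
end

sign, exponent, fraction = binary64_fields(-12.125)
puts format('%01b %011b %052b', sign, exponent, fraction)
```

The integers can be turned into the bit strings the question asks for with `%011b` and `%052b`, as in the last line.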

Patrice Gahide · Answer

Use frexp from the Math module. It returns the fraction and exponent as numbers rather than as bits. From the docs:

fraction, exponent = Math.frexp(1234)   #=> [0.6025390625, 11]
fraction * 2**exponent                  #=> 1234.0

The sign bit is easy to find on its own.
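Since the question asks for bits rather than numeric values, here is a rough sketch of how the `frexp` result might be mapped back onto the stored fields. It assumes a normal, finite, non-zero Float (zeros, subnormals, infinities and NaN are not handled), and `frexp_bit_fields` is a made-up name. `frexp` returns a fraction in [0.5, 1), whereas IEEE 754 stores a significand in [1, 2) with a bias of 1023, hence the adjustments.

```ruby
# Hypothetical helper; only valid for normal, finite, non-zero Floats.
def frexp_bit_fields(f)
  frac, exp = Math.frexp(f)      # f == frac * 2**exp, 0.5 <= |frac| < 1

  sign     = f < 0 ? 1 : 0
  # |f| == (2 * |frac|) * 2**(exp - 1), so the biased exponent is exp - 1 + 1023
  biased   = exp + 1022
  # Drop the implicit leading 1 and scale the remainder to a 52-bit integer
  fraction = ((frac.abs * 2 - 1) * 2**52).to_i

  [sign.to_s, '%011b' % biased, '%052b' % fraction]
end

p frexp_bit_fields(12.125)
```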

Tags: floating-point, ruby

John Feminella
