
Unpacking hex-encoded floats

I'm trying to translate the following Python code into C++:

import struct
import binascii


inputstring = ("0000003F" "0000803F" "AD10753F" "00000080")
num_vals = 4

for i in range(num_vals):
    rawhex = inputstring[i*8:(i*8)+8]

    # <f for little endian float
    val = struct.unpack("<f", binascii.unhexlify(rawhex))[0]
    print val

    # Output:
    # 0.5
    # 1.0
    # 0.957285702229
    # -0.0

So it reads 32 bits' worth of the hex-encoded string at a time, turns it into a byte array with the unhexlify method, and interprets it as a little-endian float value.

The following almost works, but the code is kind of crappy (and the last 00000080 parses incorrectly):

#include <sstream>
#include <iostream>


int main()
{
    // The hex-encoded string, and number of values are loaded from a file.
    // The num_vals might be wrong, so some basic error checking is needed.
    std::string inputstring = "0000003F" "0000803F" "AD10753F" "00000080";
    int num_vals = 4;


    std::istringstream ss(inputstring);

    for(unsigned int i = 0; i < num_vals; ++i)
    {
        char rawhex[8];

// The ifdef is wrong. It is not the way to detect endianness (it's
// always defined)
#ifdef BIG_ENDIAN
        rawhex[6] = ss.get();
        rawhex[7] = ss.get();

        rawhex[4] = ss.get();
        rawhex[5] = ss.get();

        rawhex[2] = ss.get();
        rawhex[3] = ss.get();

        rawhex[0] = ss.get();
        rawhex[1] = ss.get();
#else
        rawhex[0] = ss.get();
        rawhex[1] = ss.get();

        rawhex[2] = ss.get();
        rawhex[3] = ss.get();

        rawhex[4] = ss.get();
        rawhex[5] = ss.get();

        rawhex[6] = ss.get();
        rawhex[7] = ss.get();
#endif

        if(ss.good())
        {
            std::stringstream convert;
            convert << std::hex << rawhex;
            int32_t val;
            convert >> val;

            std::cerr << (*(float*)(&val)) << "\n";
        }
        else
        {
            std::ostringstream os;
            os << "Not enough values in LUT data. Found " << i;
            os << ". Expected " << num_vals;
            std::cerr << os.str() << std::endl;
            throw std::exception();
        }
    }
}

(compiles on OS X 10.7/gcc-4.2.1, with a simple g++ blah.cpp)

In particular, I'd like to get rid of the BIG_ENDIAN macro stuff; I'm sure there is a nicer way to do this, as this post discusses.
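
(For reference, the kind of runtime check I have in mind instead of the macro looks roughly like the sketch below - the helper name is just illustrative, and it's probably still not the "nicer way" I'm after:)

#include <cstdint>

// Rough sketch of a runtime endianness check: inspect the first byte of a
// known 32-bit value instead of relying on a preprocessor macro.
bool is_little_endian()
{
    const uint32_t probe = 1;
    // On a little-endian host the least significant byte is stored first.
    return *reinterpret_cast<const unsigned char*>(&probe) == 1;
}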

A few other random details - I can't use Boost (too large a dependency for the project). The string will usually contain between 1536 (8^3 * 3) and 98304 (32^3 * 3) float values, and at most 786432 (64^3 * 3).

(edit2: added another value, 00000080 == -0.0)

asked Apr 07 '12 by dbr

2 Answers

The following is your updated code, modified to remove the #ifdef BIG_ENDIAN block. It uses a read technique that should be independent of host byte order: the hex octets (which are in little-endian order in your source string) are read into a big-endian string that the iostream std::hex manipulator can parse directly. Once the text is in that form, the host's byte order no longer matters.

Additionally, it fixes a bug: rawhex needs to be zero-terminated before being streamed into convert, otherwise trailing garbage can end up in the conversion in some cases.

I do not have a big endian system to test on, so please verify on your platform. This was compiled and tested under Cygwin.

#include <sstream>
#include <iostream>
#include <cstdint>  // for uint32_t

int main()
{
    // The hex-encoded string, and number of values are loaded from a file.
    // The num_vals might be wrong, so some basic error checking is needed.
    std::string inputstring = "0000003F0000803FAD10753F00000080";
    int num_vals = 4;
    std::istringstream ss(inputstring);
    size_t const k_DataSize = sizeof(float);
    size_t const k_HexOctetLen = 2;

    for (uint32_t i = 0; i < num_vals; ++i)
    {
        char rawhex[k_DataSize * k_HexOctetLen + 1];

        // read little endian string into memory array
        for (uint32_t j=k_DataSize; (j > 0) && ss.good(); --j)
        {
            ss.read(rawhex + ((j-1) * k_HexOctetLen), k_HexOctetLen);
        }

        // terminate the string (needed for safe conversion)
        rawhex[k_DataSize * k_HexOctetLen] = 0;

        if (ss.good())
        {
            std::stringstream convert;
            convert << std::hex << rawhex;
            uint32_t val;
            convert >> val;

            std::cerr << (*(float*)(&val)) << "\n";
        }
        else
        {
            std::ostringstream os;
            os << "Not enough values in LUT data. Found " << i;
            os << ". Expected " << num_vals;
            std::cerr << os.str() << std::endl;
            throw std::exception();
        }
    }
}
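
Since the input is already hex text, the same per-group conversion can also be written with strtoul plus memcpy. The following is just a sketch (the helper name is arbitrary, and it assumes each group is exactly 8 valid hex digits):

#include <cstdint>
#include <cstdlib>
#include <cstring>

// Sketch: convert one 8-character hex group (little-endian octet order)
// into a float without going through a stringstream.
float hexgroup_to_float(const char* rawhex)
{
    char buf[9];
    // Reorder the four hex octets from little-endian to big-endian text.
    for (int j = 0; j < 4; ++j)
        std::memcpy(buf + j * 2, rawhex + (3 - j) * 2, 2);
    buf[8] = '\0';

    uint32_t bits = static_cast<uint32_t>(std::strtoul(buf, NULL, 16));

    float val;
    std::memcpy(&val, &bits, sizeof val);  // avoids the cast-based type pun
    return val;
}

For example, hexgroup_to_float("0000003F") should yield 0.5f.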
answered Oct 05 '22 by Amardeep AC9MF

I think the whole istringstream business is overkill. It's much easier to parse this yourself, one digit at a time.

First, create a function to convert a hex digit into an integer:

#include <cctype>  // for tolower and isdigit

signed char htod(char c)
{
  c = tolower(c);
  if(isdigit(c))
    return c - '0';

  if(c >= 'a' && c <= 'f')
    return c - 'a' + 10;

  return -1;
}

Then simply convert the string into an integer. The code below doesn't check for errors and assumes big endianness -- but you should be able to fill in the details.

uint32_t t = 0;
for(size_t i = 0; i < s.length(); ++i)
  t = (t << 4) | htod(s[i]);

Then your float is

float f = * (float *) &t;
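
To fill in those details for the little-endian byte order in the question, here is one possible sketch (it reuses the htod function above, does no error checking, and the function name is arbitrary):

#include <cstdint>
#include <cstring>
#include <string>

// Sketch: build the 32-bit pattern from 8 hex digits stored in
// little-endian octet order, then copy the bits into a float.
float parse_le_float(const std::string& s)  // s must be exactly 8 hex digits
{
    uint32_t t = 0;
    for (int byte = 0; byte < 4; ++byte)
    {
        uint32_t hi = htod(s[byte * 2]);
        uint32_t lo = htod(s[byte * 2 + 1]);
        t |= ((hi << 4) | lo) << (8 * byte);  // earlier octets are less significant
    }

    float f;
    std::memcpy(&f, &t, sizeof f);  // safer alternative to the pointer cast
    return f;
}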
answered Oct 04 '22 by George Skoptsov