Parsing binary data into ctypes Structure object via readinto()

I'm trying to handle a binary format, following the example here:

http://dabeaz.blogspot.jp/2009/08/python-binary-io-handling.html

>>> from ctypes import *
>>> class Point(Structure):
...     _fields_ = [ ('x',c_double), ('y',c_double), ('z',c_double) ]
...
>>> g = open("foo","rb") # point structure data
>>> q = Point()
>>> g.readinto(q)
24
>>> q.x
2.0
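
For anyone who wants to try this without a data file, here is a minimal self-contained sketch of the same idea. It assumes an in-memory io.BytesIO stream in place of the file "foo", and the coordinate values are made up; neither comes from the linked post.

```python
import io
import struct
from ctypes import Structure, c_double, sizeof

class Point(Structure):
    _fields_ = [("x", c_double), ("y", c_double), ("z", c_double)]

# Hypothetical stand-in for the file "foo": three doubles in native byte order.
stream = io.BytesIO(struct.pack("=3d", 2.0, 3.0, 4.0))

q = Point()
bytes_read = stream.readinto(q)  # fills the structure's memory in place

print(bytes_read, q.x, q.y, q.z)
```

readinto() reports how many bytes it wrote into the buffer, which here matches sizeof(Point).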

I've defined a Structure of my header and I'm trying to read data into my structure, but I'm having some difficulty. My structure is like this:

class BinaryHeader(BigEndianStructure):
    _fields_ = [
                ("sequence_number_4bytes", c_uint),
                ("ascii_text_32bytes", c_char),
                ("timestamp_4bytes", c_uint),
                ("more_funky_numbers_7bytes", c_uint, 56),
                ("some_flags_1byte", c_byte),
                ("other_flags_1byte", c_byte),
                ("payload_length_2bytes", c_ushort),

                ] 

The ctypes documentation says:

For integer type fields like c_int, a third optional item can be given. It must be a small positive integer defining the bit width of the field.

So for ("more_funky_numbers_7bytes", c_uint, 56), I've tried to define the field as a 7 byte field, but I'm getting the error:

ValueError: number of bits invalid for bit field

So my first problem is: how can I define a 7-byte integer field?
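
A likely reason for the error, sketched below: the bit width of a bit field cannot exceed the size of its underlying type, and c_uint is normally 32 bits wide, so 56 is out of range. The class name TooWide is hypothetical and exists only to reproduce the error.

```python
from ctypes import Structure, c_uint, sizeof

# Storage available to a c_uint bit field, in bits (32 on common platforms).
max_bits = sizeof(c_uint) * 8

error_message = None
try:
    class TooWide(Structure):
        # 56 bits requested, more than c_uint can hold.
        _fields_ = [("value", c_uint, 56)]
except ValueError as exc:
    error_message = str(exc)

print(max_bits, error_message)
```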

If I skip that problem and comment out the "more_funky_numbers_7bytes" field, the data gets loaded, but as expected only 1 character is loaded into "ascii_text_32bytes". For some reason readinto() returns 16, which I assume is the number of bytes it read into the structure. But if the "funky number" field is commented out and "ascii_text_32bytes" holds only one char (1 byte), shouldn't that be 13, not 16?
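
One way to investigate the 16 is to ask ctypes where it placed each field. The sketch below assumes the usual 4-byte c_uint with 4-byte alignment; the class name HeaderWithoutFunkyField is hypothetical. On such a platform the single c_char at offset 4 is followed by 3 padding bytes so that the next c_uint starts at offset 8, which would account for 13 + 3 = 16.

```python
from ctypes import BigEndianStructure, c_byte, c_char, c_uint, c_ushort, sizeof

class HeaderWithoutFunkyField(BigEndianStructure):
    _fields_ = [
        ("sequence_number_4bytes", c_uint),
        ("ascii_text_32bytes", c_char),   # a single char: 1 byte
        ("timestamp_4bytes", c_uint),
        ("some_flags_1byte", c_byte),
        ("other_flags_1byte", c_byte),
        ("payload_length_2bytes", c_ushort),
    ]

# Byte offset of every field inside the structure.
offsets = {
    field[0]: getattr(HeaderWithoutFunkyField, field[0]).offset
    for field in HeaderWithoutFunkyField._fields_
}
total_size = sizeof(HeaderWithoutFunkyField)

for name, offset in offsets.items():
    print(f"{name:26} offset {offset}")
print("sizeof:", total_size)
```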

Then I tried breaking out the char field into a separate structure and referencing that from within my Header structure. But that's not working either...

class StupidStaticCharField(BigEndianStructure):
    _fields_ = [
                ("ascii_text_1", c_byte),
                ("ascii_text_2", c_byte),
                ("ascii_text_3", c_byte),
                ("ascii_text_4", c_byte),
                ("ascii_text_5", c_byte),
                ("ascii_text_6", c_byte),
                ("ascii_text_7", c_byte),
                ("ascii_text_8", c_byte),
                ("ascii_text_9", c_byte),
                ("ascii_text_10", c_byte),
                ("ascii_text_11", c_byte),
                .
                .
                .
                ]

class BinaryHeader(BigEndianStructure):
    _fields_ = [
                ("sequence_number_4bytes", c_uint),
                ("ascii_text_32bytes", StupidStaticCharField),
                ("timestamp_4bytes", c_uint),
                #("more_funky_numbers_7bytes", c_uint, 56),
                ("some_flags_1byte", c_ushort),
                ("other_flags_1byte", c_ushort),
                ("payload_length_2bytes", c_ushort),

                ] 

So, any ideas how to:

  1. Define a 7 byte field (which I'll need to decode using a defined function)
  2. Define a static char field of 32 bytes

UPDATE

I've found a structure that seems to work...

class BinaryHeader(BigEndianStructure):
    _fields_ = [
                ("sequence_number_4bytes", c_uint),
                ("ascii_text_32bytes", c_char * 32),
                ("timestamp_4bytes", c_uint),
                ("more_funky_numbers_7bytes", c_byte * 7),
                ("some_flags_1byte", c_byte),
                ("other_flags_1byte", c_byte),
                ("payload_length_2bytes", c_ushort),

                ]  
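
With the 7-byte field stored as a raw byte array, decoding it is a separate step. The sketch below is only an illustration: the real encoding of "more_funky_numbers_7bytes" is not stated here, so it assumes a big-endian unsigned integer. The helper name decode_uint56 and the sample bytes are hypothetical, and it uses c_ubyte rather than c_byte so the raw bytes are unsigned.

```python
from ctypes import c_ubyte

SevenBytes = c_ubyte * 7  # same shape as the 7-byte field, but unsigned

def decode_uint56(raw):
    """Interpret a 7-byte ctypes array as a big-endian unsigned integer.

    Assumption: the field is a plain big-endian integer; substitute the
    real decoding rule if the format defines something else.
    """
    return int.from_bytes(bytes(raw), "big")

# Made-up sample value: 0x00000000000102
sample = SevenBytes.from_buffer_copy(b"\x00\x00\x00\x00\x00\x01\x02")
decoded = decode_uint56(sample)
print(decoded)
```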

Now, however, my remaining question is: why, when I use .readinto():

f = open(binaryfile, "rb")

mystruct = BinaryHeader()
f.readinto(mystruct)

It returns 52 and not the expected 51. Where is that extra byte coming from, and where does it go?
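
A plausible explanation is alignment padding: the fields before "payload_length_2bytes" end at offset 49, and a c_ushort normally has to start on an even offset, so one unused byte would be inserted in front of it. The sketch below compares the default layout with a packed one using the ctypes `_pack_` attribute. The class names PaddedHeader and PackedHeader are hypothetical, the sizes assume a typical platform with a 4-byte c_uint, and newer Python versions may emit a deprecation warning for `_pack_` on non-Windows platforms.

```python
from ctypes import BigEndianStructure, c_byte, c_char, c_uint, c_ushort, sizeof

HEADER_FIELDS = [
    ("sequence_number_4bytes", c_uint),
    ("ascii_text_32bytes", c_char * 32),
    ("timestamp_4bytes", c_uint),
    ("more_funky_numbers_7bytes", c_byte * 7),
    ("some_flags_1byte", c_byte),
    ("other_flags_1byte", c_byte),
    ("payload_length_2bytes", c_ushort),
]

class PaddedHeader(BigEndianStructure):
    # Default layout: fields are aligned the way a C compiler would align them.
    _fields_ = HEADER_FIELDS

class PackedHeader(BigEndianStructure):
    # _pack_ = 1 asks ctypes not to insert alignment padding between fields.
    _pack_ = 1
    _fields_ = HEADER_FIELDS

padded_size = sizeof(PaddedHeader)
packed_size = sizeof(PackedHeader)
padded_payload_offset = PaddedHeader.payload_length_2bytes.offset
packed_payload_offset = PackedHeader.payload_length_2bytes.offset

print(padded_size, padded_payload_offset)
print(packed_size, packed_payload_offset)
```

If the file really contains 51-byte headers with no padding, reading into the padded layout would shift "payload_length_2bytes" by one byte, so the packed variant is probably the one that matches the data.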

UPDATE 2 For those interested, here's an example of the alternative mentioned by eryksun: using the struct module to read values into a namedtuple:

>>> from struct import unpack
>>> record = b'raymond   \x32\x12\x08\x01\x08'
>>> name, serialnum, school, gradelevel = unpack('<10sHHb', record)

>>> from collections import namedtuple
>>> Student = namedtuple('Student', 'name serialnum school gradelevel')
>>> Student._make(unpack('<10sHHb', record))
Student(name=b'raymond   ', serialnum=4658, school=264, gradelevel=8)