Unpacking a struct ending with an ASCIIZ string

Tags:

python

struct

I am trying to use struct.unpack() to take apart a data record that ends with an ASCII string.

The record (it happens to be a TomTom ov2 record) has this format (stored little-endian):

1 byte
4 byte int for total record size (including this field)
4 byte int
4 byte int
variable-length string, null-terminated

unpack() requires that the string's length be included in the format you pass it. I can use the second field and the known size of the rest of the record -- 13 bytes -- to get the string length:

str_len = struct.unpack("<xi", record[:5])[0] - 13
fmt = "<biii{0}s".format(str_len)

then proceed with the full unpacking, but since the string is null-terminated, I really wish unpack() would do it for me. It'd also be nice to have this should I run across a struct that doesn't include its own size.
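Spelled out in full, the two-step approach looks roughly like the sketch below. The sample record is assembled by hand purely for illustration (its field values are placeholders, not data from a real ov2 file), and the trailing NUL still has to be stripped manually:

```python
import struct

# Hand-built sample record; all values are illustrative placeholders.
name = b"Down by the river"
size = 13 + len(name) + 1          # fixed fields + string + terminating NUL
record = struct.pack("<biii", 2, size, 4886138, 229297) + name + b"\0"

# Step 1: skip the first byte and read the total record size.
str_len = struct.unpack("<xi", record[:5])[0] - 13

# Step 2: build the full format now that the string length is known.
fmt = "<biii{0}s".format(str_len)
fields = struct.unpack(fmt, record)

# The 's' field still contains the terminator, so drop it by hand.
text = fields[-1].rstrip(b"\0")
print(fields[:4], text)
```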

How can I make that happen?

asked Aug 07 '12 17:08

jscs

2 Answers

I made two new functions that should be usable as drop-in replacements for the standard pack and unpack functions. They both support the 'z' character to pack/unpack an ASCIIZ string. There are no restrictions on the location or number of occurrences of the 'z' character in the format string:

import struct

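def unpack (format, buffer) :
    # Replace each 'z' in turn with an explicit '<len>sx': the string bytes
    # followed by one pad byte that swallows the terminating NUL.
    while True :
        pos = format.find ('z')
        if pos < 0 :
            break
        # Offset of the string = size of everything before this 'z'
        asciiz_start = struct.calcsize (format[:pos])
        # b'\0' works for Python 3 bytes as well as Python 2 str
        asciiz_len = buffer[asciiz_start:].find(b'\0')
        if asciiz_len < 0 :
            raise struct.error ('no terminating NUL found for z field')
        format = '%s%dsx%s' % (format[:pos], asciiz_len, format[pos+1:])
    return struct.unpack (format, buffer)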

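def pack (format, *args) :
    # Build a plain struct format, tracking which argument each code consumes
    new_format = ''
    arg_number = 0
    for c in format :
        if c == 'z' :
            # One extra byte: struct pads an 's' field with NULs, which
            # supplies the terminator
            new_format += '%ds' % (len(args[arg_number])+1)
            arg_number += 1
        else :
            new_format += c
            # Codes that consume one argument; note that a repeat count
            # such as '2i' is still counted as a single argument here
            if c in 'cbB?hHiIlLqQfdspP' :
                arg_number += 1
    return struct.pack (new_format, *args)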

Here's an example of how to use them:

>>> from struct_z import pack, unpack
>>> line = pack ('<izizi', 1, 'Hello', 2, ' world!', 3)
>>> print line.encode('hex')
0100000048656c6c6f000200000020776f726c64210003000000
>>> print unpack ('<izizi',line)
(1, 'Hello', 2, ' world!', 3)
>>>
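To make the format rewriting visible, here is a small Python 3 sketch that performs only the expansion step and returns the resulting format string. The helper name expand_z is made up for this illustration and is not part of the functions above; it assumes every 'z' field really is NUL-terminated in the buffer:

```python
import struct

def expand_z(fmt, buf):
    """Return fmt with each 'z' replaced by '<len>sx' for this buffer.

    Hypothetical helper for illustration only; assumes every 'z' field
    in buf is terminated by a NUL byte.
    """
    while True:
        pos = fmt.find('z')
        if pos < 0:
            return fmt
        # Offset of this string = size of everything before the 'z'
        start = struct.calcsize(fmt[:pos])
        length = buf.find(b'\0', start) - start
        fmt = '%s%dsx%s' % (fmt[:pos], length, fmt[pos + 1:])

# Sample buffer: each string is packed one byte longer so it ends in NUL.
buf = struct.pack('<i6si8si', 1, b'Hello', 2, b' world!', 3)
expanded = expand_z('<izizi', buf)
print(expanded)
print(struct.unpack(expanded, buf))
```

Each pass resolves the leftmost 'z', so the offset of any later string is computed from a prefix whose size is already fully known.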

answered Sep 20 '22 11:09

Arthur

The size-less record is fairly easy to handle, actually, since struct.calcsize() will tell you the length it expects. You can use that and the actual length of the data to construct a new format string for unpack() that includes the correct string length.
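As a rough sketch of that calculation, assuming the "<biii" header from the question and the same sample bytes used in the example further down:

```python
import struct

dat = b'\x02\x1e\x00\x00\x00z\x8eJ\x00\xb1\x7f\x03\x00Down by the river\x00'

# Size of the fixed-length part of the record
fixed_len = struct.calcsize("<biii")

# Whatever is left over is the string plus its terminating NUL
str_len = len(dat) - fixed_len

# Unpack the string without the NUL; the final 'x' skips the terminator
fmt = "<biii{0}sx".format(str_len - 1)
fields = struct.unpack(fmt, dat)
print(fixed_len, str_len, fmt)
print(fields)
```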

This function is just a wrapper for unpack(), allowing a new format character in the last position that will drop the terminal NUL:

import struct
def unpack_with_final_asciiz(fmt, dat):
    """
    Unpack binary data, handling a null-terminated string at the end 
    (and only at the end) automatically.

    The first argument, fmt, is a struct.unpack() format string with the 
    following modifications:
    If fmt's last character is 'z', the returned string will drop the NUL.
    If it is 's' with no length, the string including NUL will be returned.
    If it is 's' with a length, behavior is identical to normal unpack().
    """
    # Just pass on if no special behavior is required
    # (slicing fmt[-2:-1] avoids an IndexError when fmt is a single character)
    if fmt[-1] not in ('z', 's') or (fmt[-1] == 's' and fmt[-2:-1].isdigit()):
        return struct.unpack(fmt, dat)

    # Use format string to get size of contained string and rest of record
    non_str_len = struct.calcsize(fmt[:-1])
    str_len = len(dat) - non_str_len

    # Set up new format string
    # If passed 'z', treat terminating NUL as a "pad byte"
    if fmt[-1] == 'z':
        str_fmt = "{0}sx".format(str_len - 1)
    else:
        str_fmt = "{0}s".format(str_len)
    new_fmt = fmt[:-1] + str_fmt

    return struct.unpack(new_fmt, dat)

>>> dat = b'\x02\x1e\x00\x00\x00z\x8eJ\x00\xb1\x7f\x03\x00Down by the river\x00'
>>> unpack_with_final_asciiz("<biiiz", dat)
(2, 30, 4886138, 229297, b'Down by the river')

answered Sep 19 '22 11:09

jscs
