Python Struct Unpack

Tags: python, struct
I have this little problem that has been bugging me for the past hour or so.

import struct

string = b'-'
t = struct.pack(">h%ds" % len(string), len(string), string)
print(t)

The result of this pack is b'\x00\x01-'.

The problem I'm having is that I can't figure out how to unpack the result b'\x00\x01-' so that I get just b'-' back. Yes, I know I can just strip the leading bytes, but my real case is a little more complicated; I've simplified it here. Hopefully someone can assist me. :)

dbdii407 asked Mar 01 '11 23:03



2 Answers

Normally you wouldn't use struct.pack to put a length header and the value together. Instead, you would pack only the header with struct.pack(">h", len(data)), send that over the wire (for example, in a network protocol), and then send the data itself. There is no need to create a new bytes buffer.

In your case, you could simply do:

dataLength, = struct.unpack(">h", t[:2])
data = t[2:2+dataLength]
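
If the real data is "a little bit more complicated", for example several length-prefixed records back to back, the same slicing idea can be wrapped in a small helper. This is only a sketch: unpack_prefixed and HEADER are hypothetical names made up for illustration, and it assumes every record uses the 2-byte big-endian ">h" header from the question.

```python
import struct

# Assumption: every record starts with a 2-byte big-endian signed length (">h").
HEADER = struct.Struct(">h")


def unpack_prefixed(buf, offset=0):
    """Read one length-prefixed record from buf, starting at offset.

    Returns (payload, next_offset). Hypothetical helper, not part of struct.
    """
    (length,) = HEADER.unpack_from(buf, offset)
    start = offset + HEADER.size
    end = start + length
    if length < 0 or end > len(buf):
        raise ValueError("truncated or corrupt record")
    return buf[start:end], end


t = struct.pack(">h1s", 1, b"-")
payload, next_offset = unpack_prefixed(t)
print(payload)      # b'-'
print(next_offset)  # 3
```

The returned offset can be passed back in to read the next record from the same buffer without copying the remainder.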

But as I said, in a socket-based application, for instance, it would look like this (where receive(n) is assumed to return exactly n bytes from the connection):

header = receive(2)
dataLength, = struct.unpack(">h", header)
data = receive(dataLength)
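
For what it's worth, here is one way the receive() call above might be filled in with real sockets. It is a sketch under assumptions, not the answer's actual code: recv_exact, send_message and recv_message are hypothetical helper names, and socket.socketpair() is only used so the example runs without a server. The loop is needed because sock.recv(n) may return fewer than n bytes.

```python
import socket
import struct


def recv_exact(sock, n):
    """Read exactly n bytes from sock, or raise if the peer closes early."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed before %d bytes arrived" % n)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def send_message(sock, data):
    # Header first, then the payload; no combined buffer is built.
    sock.sendall(struct.pack(">h", len(data)))
    sock.sendall(data)


def recv_message(sock):
    (data_length,) = struct.unpack(">h", recv_exact(sock, 2))
    return recv_exact(sock, data_length)


left, right = socket.socketpair()
try:
    send_message(left, b"-")
    received = recv_message(right)
finally:
    left.close()
    right.close()
print(received)  # b'-'
```

Note that ">h" is a signed 16-bit integer, so this framing tops out at 32767 bytes per message; struct.pack raises struct.error for anything longer.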
AndiDog answered Sep 24 '22 02:09



import struct
string = b'-'
fmt = ">h%ds" % len(string)

Here you are packing both the length and the string:

t = struct.pack(fmt, len(string), string)
print(repr(t))
# b'\x00\x01-'

So when you unpack, you should expect to get two values back, i.e., the length and the string:

length, string2 = struct.unpack(fmt, t)
print(repr(string2))
# b'-'
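
On the receiving side you usually won't have len(string) available to build fmt. Assuming you at least know the header is ">h", one option is to read the header first and rebuild the full format from it. A minimal sketch:

```python
import struct

t = b"\x00\x01-"  # the packed bytes from the question

# Step 1: read only the 2-byte header to learn the payload length.
(length,) = struct.unpack_from(">h", t)

# Step 2: rebuild the same format string that was used for packing.
fmt = ">h%ds" % length
if struct.calcsize(fmt) != len(t):
    raise ValueError("buffer size does not match the length header")

length, string2 = struct.unpack(fmt, t)
print(fmt, length, string2)  # >h1s 1 b'-'
```

struct.unpack requires the buffer to be exactly struct.calcsize(fmt) bytes long, which is why the size is checked before the second call.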

In general, if you don't know how the string was packed, then there is no sure-fire way to recover the data. You'd just have to guess!

If you know the data is composed of the length of the string, and then the string itself, then you could try trial-and-error:

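import struct
string = b'-'
fmt = ">h%ds" % len(string)
t = struct.pack(fmt, len(string), string)
print(repr(t))
# b'\x00\x01-'

# Try every byte order and integer type as a candidate length header.
for endian in ('>', '<'):
    for code, size in (('b', 1), ('B', 1), ('h', 2), ('H', 2), ('i', 4), ('I', 4),
                       ('l', 4), ('L', 4), ('q', 8), ('Q', 8)):
        header_fmt = endian + code
        try:
            length, = struct.unpack(header_fmt, t[:size])
        except struct.error:
            continue  # t is too short for this header type
        guess_fmt = header_fmt + '{0}s'.format(length)
        try:
            length, string2 = struct.unpack(guess_fmt, t)
        except struct.error:
            continue  # decoded length is negative or does not fit len(t)
        print(guess_fmt, length, string2)
# Output under Python 3:
# >h1s 1 b'-'
# >H1s 1 b'-'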

However, it might be possible to construct an ambiguous t that has multiple valid unpackings yielding different string2 values. I'm not sure.

unutbu answered Sep 24 '22 02:09
