
How can I unpack binary hex formatted data in Python?

Tags: python, hex, binary

Using the PHP pack() function, I have converted a string into a binary hex representation:

$string = md5(time()); // 32 character length
$packed = pack('H*', $string);

The H* formatting means "Hex string, high nibble first".

To unpack this in PHP, I would simply use the unpack() function with the H* format flag.

How would I unpack this data in Python?

davidmytton asked Oct 14 '08 11:10


2 Answers

There's an easy way to do this with the binascii module:

>>> import binascii
>>> binascii.hexlify("ABCZ")
'4142435a'
>>> binascii.unhexlify("4142435a")
'ABCZ'

Unless I'm misunderstanding something about the nibble ordering (high-nibble first is the default… anything different is insane), that should be perfectly sufficient!
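
For anyone on Python 3, here is a minimal sketch of the same round trip. It assumes the packed data arrives as a bytes object; the variable names are only illustrative.

```python
import binascii

# Hex text -> raw bytes (the equivalent of PHP's pack('H*', ...))
packed = binascii.unhexlify("4142435a")

# Raw bytes -> hex text (the equivalent of PHP's unpack('H*', ...)).
# hexlify() returns bytes in Python 3, so decode to get a str.
hex_text = binascii.hexlify(packed).decode("ascii")

print(packed)
print(hex_text)
```

The only real difference from the Python 2 session above is that the packed value is bytes rather than str.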

Furthermore, Python's hashlib.md5 objects have a hexdigest() method to automatically convert the MD5 digest to an ASCII hex string, so that this method isn't even necessary for MD5 digests. Hope that helps.

Dan Lenski answered Oct 26 '22 23:10


There's no corresponding "hex nibble" code for struct.pack, so you'll either need to manually pack into bytes first, like:

import struct

hex_string = 'abcdef12'

# Turn each hex digit into its 4-bit value, then combine pairs (high nibble first)
hexdigits = [int(x, 16) for x in hex_string]
data = ''.join(struct.pack('B', (high << 4) + low)
               for high, low in zip(hexdigits[::2], hexdigits[1::2]))

Or better, you can just use the hex codec, i.e.:

>>> data = hex_string.decode('hex')
>>> data
'\xab\xcd\xef\x12'

To unpack, you can encode the result back to hex similarly:

>>> data.encode('hex')
'abcdef12'
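
The str hex codec is Python 2 only. As a rough sketch of the Python 3 equivalents (pack_hex, unpack_hex and pack_hex_manual are hypothetical helper names, not part of any library), something like this should behave the same way for even-length hex strings:

```python
def pack_hex(hex_string):
    """Hex text -> raw bytes, high nibble first (like PHP pack('H*', ...))."""
    return bytes.fromhex(hex_string)


def unpack_hex(data):
    """Raw bytes -> lowercase hex text (like PHP unpack('H*', ...))."""
    return data.hex()


def pack_hex_manual(hex_string):
    """Same as pack_hex, but spelling out the nibble arithmetic."""
    digits = [int(ch, 16) for ch in hex_string]
    # Pair the digits up: the first of each pair is the high nibble.
    return bytes((high << 4) | low
                 for high, low in zip(digits[::2], digits[1::2]))


data = pack_hex("abcdef12")
print(data)
print(unpack_hex(data))
```

The manual version is only there to show what "high nibble first" means; bytes.fromhex() and bytes.hex() are the ones to use in practice.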

However, note that for your example, there's probably no need to take the round-trip through a hex representation at all when encoding. Just use the md5 binary digest directly, i.e.:

>>> import md5
>>> x = md5.md5('some string')
>>> x.digest()
'Z\xc7I\xfb\xee\xc96\x07\xfc(\xd6f\xbe\x85\xe7:'

This is equivalent to your pack()ed representation. To get the hex representation, use the same unpack method above:

>>> x.digest().encode('hex')
'5ac749fbeec93607fc28d666be85e73a'
>>> x.hexdigest()
'5ac749fbeec93607fc28d666be85e73a'
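
The md5 module was removed in Python 3, so there you would go through hashlib instead. A small sketch, assuming the input is already bytes (the variable names are just for illustration):

```python
import hashlib

message = b"some string"        # hashlib needs bytes in Python 3
md5 = hashlib.md5(message)

raw_digest = md5.digest()       # 16 raw bytes, the "packed" form
hex_digest = md5.hexdigest()    # 32-character hex string

# Packing the hex string gives back the raw digest, and vice versa.
print(bytes.fromhex(hex_digest) == raw_digest)
print(raw_digest.hex() == hex_digest)
```

Both comparisons should print True, which is the sense in which digest() matches the pack()ed representation.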

[Edit]: Updated to use better method (hex codec)

Kratz answered Oct 26 '22 23:10