
pythonic way to hex dump files

Tags: python, hexdump

My question is simple:

Is there a Pythonic way to reproduce this bash command?

hexdump -e '2/1 "%02x"' file.dat

Obviously, without using os, popen, or any shortcut ;)

EDIT: although I didn't say so explicitly, it would be great if the code worked in Python 3.x.

Thanks!

asked Jul 28 '14 22:07 by peluzza




1 Answer

If you only care about Python 2.x, chunk.encode('hex') will encode a chunk of binary data into hex. So:

with open('file.dat', 'rb') as f:
    for chunk in iter(lambda: f.read(32), b''):
        print chunk.encode('hex')

(IIRC, hexdump by default prints 32 pairs of hex per line; if not, just change that 32 to 16 or whatever it is…)

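If the two-argument iter looks baffling, see the documentation for the built-in iter function: iter(callable, sentinel) calls the callable repeatedly and stops as soon as it returns the sentinel. It's not too complicated once you get the idea.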
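As a minimal sketch of how the two-argument iter behaves, the snippet below reads from an in-memory io.BytesIO stream instead of file.dat. The helper name read_chunks and the sample data are made up for illustration.

```python
import io

def read_chunks(stream, size):
    """Return an iterator over successive blocks of up to `size` bytes."""
    # iter(callable, sentinel) calls the callable repeatedly and stops
    # as soon as it returns the sentinel (here, the empty bytes object
    # that read() returns at end of stream).
    return iter(lambda: stream.read(size), b'')

# Hypothetical sample data standing in for the contents of a file.
chunks = list(read_chunks(io.BytesIO(b'abcdefgh'), 3))
print(chunks)  # [b'abc', b'def', b'gh']
```

The last chunk is shorter than the requested size, and the empty read that follows it ends the iteration without being yielded.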

If you care about Python 3.x, the encode method only works for codecs that convert Unicode strings to bytes; for any codec that converts the other way around (or any other combination), you have to call codecs.encode explicitly:

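import codecs

with open('file.dat', 'rb') as f:
    for chunk in iter(lambda: f.read(32), b''):
        # codecs.encode returns bytes; decode so print shows plain hex, not b'...'
        print(codecs.encode(chunk, 'hex').decode('ascii'))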

Or it may be better to use hexlify:

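import binascii

with open('file.dat', 'rb') as f:
    for chunk in iter(lambda: f.read(32), b''):
        # hexlify also returns bytes; decode so print shows plain hex, not b'...'
        print(binascii.hexlify(chunk).decode('ascii'))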
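For comparison, here is a small sketch showing that the two Python 3 approaches produce the same hex digits for a made-up four-byte sample. It also includes bytes.hex(), which is not mentioned above but is available on Python 3.5 and later; treat that as an optional alternative rather than part of the original answer.

```python
import binascii
import codecs

# Hypothetical sample bytes, chosen to include values below 0x10.
chunk = b'\x00\x01\xfe\xff'

# Both functions return bytes on Python 3, so decode to get a str.
via_codecs = codecs.encode(chunk, 'hex').decode('ascii')
via_hexlify = binascii.hexlify(chunk).decode('ascii')

# bytes.hex() (Python 3.5+) returns a str directly.
via_method = chunk.hex()

print(via_codecs, via_hexlify, via_method)  # 0001feff 0001feff 0001feff
```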

If you want to do something other than print the chunks, and you don't want to read the whole file into memory, you probably want an iterator. You could put the loop in a function and change the print to a yield, and calling that function gives you exactly the iterator you want. Or use a generator expression or a map call:

import binascii

with open('file.dat', 'rb') as f:
    chunks = iter(lambda: f.read(32), b'')
    # map is lazy on Python 3, so consume hexlines while the file is still open
    hexlines = map(binascii.hexlify, chunks)
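The suggestion of wrapping the loop in a function and replacing print with yield might look like the sketch below. The function name hex_lines, the width parameter, and the in-memory sample stream are assumptions for illustration; with a real file you would pass the object returned by open('file.dat', 'rb').

```python
import binascii
import io

def hex_lines(stream, width=32):
    """Lazily yield one hex string per `width`-byte chunk of a binary stream."""
    for chunk in iter(lambda: stream.read(width), b''):
        # hexlify returns bytes on Python 3; decode to get a plain str.
        yield binascii.hexlify(chunk).decode('ascii')

# Hypothetical sample: 40 bytes with values 0..39, read 16 bytes at a time.
sample = io.BytesIO(bytes(range(40)))
lines = list(hex_lines(sample, width=16))
for line in lines:
    print(line)
```

Because the generator reads only one chunk at a time, the whole file never has to be in memory. Joining the yielded strings with no separator gives one continuous run of hex digits.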
answered Sep 30 '22 20:09 by abarnert