
Analyzing MIPS binaries: is there a Python library for parsing binary data?

I'm working on a utility which needs to resolve hex addresses to a symbolic function name and source code line number within a binary. The utility will run on Linux on x86, though the binaries it analyzes will be for a MIPS-based embedded system. The MIPS binaries are in ELF format, using DWARF for the symbolic debugging information.

My current plan is to run objdump as a child process, passing it a list of hex addresses and parsing its output to get function names and source line numbers. I have compiled an objdump with support for MIPS binaries, and it is working.
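For reference, the subprocess approach might look roughly like the sketch below. Note that it swaps in addr2line (objdump's binutils sibling) rather than objdump itself, because its `-f` output is a simple pair of lines per address: the function name, then `file:line`. The tool name `mips-linux-gnu-addr2line` and the helper names `parse_addr2line` and `resolve` are assumptions for illustration; substitute whatever your MIPS-capable binutils build is called.

```python
import subprocess


def parse_addr2line(output):
    """Turn `addr2line -f` output into (function, file, line) tuples."""
    lines = output.splitlines()
    results = []
    # With -f, each address yields two lines: function name, then file:line.
    for func, loc in zip(lines[0::2], lines[1::2]):
        # Newer binutils may append " (discriminator N)" to the location.
        loc = loc.split(" (discriminator")[0]
        path, _, lineno = loc.rpartition(":")
        results.append((
            None if func == "??" else func,   # "??" means unknown
            None if path == "??" else path,
            int(lineno) if lineno.isdigit() and lineno != "0" else None,
        ))
    return results


def resolve(binary, addresses, tool="mips-linux-gnu-addr2line"):
    """Run addr2line once for all addresses (tool name is an assumption)."""
    cmd = [tool, "-f", "-e", binary] + [hex(a) for a in addresses]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_addr2line(out)
```

Passing every address in a single invocation keeps the cost to one child process per binary rather than one per lookup.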

I'd prefer a package that lets me look things up natively from Python code without spawning another process. I can find no mention of libdwarf, libelf, or libbfd on python.org, nor any mention of Python on dwarfstd.org.

Is there a suitable module available somewhere?

DGentry asked Sep 05 '08 14:09


5 Answers

You might be interested in the DWARF library from pydevtools:

>>> from bintools.dwarf import DWARF
>>> dwarf = DWARF('test/test')
>>> dwarf.get_loc_by_addr(0x8048475)
('/home/emilmont/Workspace/dbg/test/main.c', 36, 0)
emilmont answered Nov 05 '22 15:11


Take a look at pyelftools, a new pure-Python library for parsing ELF files and their DWARF debugging information, which is meant to do exactly this.
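A minimal sketch of an address-to-line lookup with pyelftools, modeled on the address-decoding example that ships with the library. The helper names `lookup` and `iter_rows` are mine, not part of pyelftools, and the search is split out as a plain function so it can be tried without an ELF file at hand.

```python
def lookup(rows, address):
    """Return (file, line) for address, or None if no row covers it.

    rows: iterable of (address, file, line, end_sequence) tuples in
    line-program order.
    """
    prev = None
    for row in rows:
        # An address belongs to the row whose range [prev, row) contains it.
        if prev is not None and prev[0] <= address < row[0]:
            return prev[1], prev[2]
        # An end_sequence row only marks where a sequence stops.
        prev = None if row[3] else row
    return None


def iter_rows(path):
    """Yield line-table rows from an ELF file (requires pyelftools)."""
    from elftools.elf.elffile import ELFFile  # third-party package

    with open(path, "rb") as f:
        elf = ELFFile(f)
        if not elf.has_dwarf_info():
            return
        dwarf = elf.get_dwarf_info()
        for cu in dwarf.iter_CUs():
            prog = dwarf.line_program_for_CU(cu)
            if prog is None:
                continue
            # File indices are 1-based before DWARF version 5.
            delta = 1 if prog.header.version < 5 else 0
            files = prog["file_entry"]
            for entry in prog.get_entries():
                st = entry.state
                if st is None:
                    continue
                name = None
                if not st.end_sequence:
                    name = files[st.file - delta].name.decode(errors="replace")
                yield st.address, name, st.line, st.end_sequence
```

Usage would be along the lines of `lookup(iter_rows("firmware.elf"), 0x8048475)`, where the file name is a placeholder. Since pyelftools is pure Python, this runs on an x86 host regardless of the target architecture of the binary.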

Eli Bendersky answered Nov 05 '22 14:11


You should give Construct a try. It is very useful for parsing binary data into Python objects.

There is even an example for the ELF32 file format.
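To give a feel for what an ELF32 description has to cover, here is a rough sketch that reads the ELF32 file header using only the standard library's `struct` module. This is deliberately not Construct's API (which has changed between releases); the field names follow the ELF specification, and the function name `parse_elf32_header` is just an illustration.

```python
import struct

EM_MIPS = 8  # e_machine value assigned to MIPS in the ELF specification

# Fields that follow the 16-byte e_ident array in an ELF32 header.
_FIELDS = (
    "e_type", "e_machine", "e_version", "e_entry", "e_phoff", "e_shoff",
    "e_flags", "e_ehsize", "e_phentsize", "e_phnum", "e_shentsize",
    "e_shnum", "e_shstrndx",
)


def parse_elf32_header(data):
    """Parse the 52-byte ELF32 file header from a bytes object."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 1:  # EI_CLASS: 1 means 32-bit
        raise ValueError("not a 32-bit ELF file")
    if data[5] not in (1, 2):  # EI_DATA: 1 little-endian, 2 big-endian
        raise ValueError("unknown data encoding")
    endian = "<" if data[5] == 1 else ">"
    values = struct.unpack_from(endian + "HHIIIIIHHHHHH", data, 16)
    header = dict(zip(_FIELDS, values))
    header["big_endian"] = endian == ">"
    return header
```

Checking `header["e_machine"] == EM_MIPS` and `header["big_endian"]` up front is a cheap way to confirm that the file really is the MIPS build you expect before doing any heavier DWARF parsing.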

Ber answered Nov 05 '22 16:11


I don't know of any, but if all else fails you could use ctypes to call libdwarf, libelf or libbfd directly.

Douglas Leeder answered Nov 05 '22 16:11


I've been developing a DWARF parser using Construct. It is currently fairly rough, and parsing is slow, but I thought I should at least let you know. It may suit your needs with a bit of work.

I've got the code in Mercurial, hosted on Bitbucket:

  • http://bitbucket.org/cmcqueen1975/pythondwarf/
  • http://bitbucket.org/cmcqueen1975/construct/ (necessary modifications to Construct library)

Construct is a very interesting library. DWARF is a complex format (as I'm discovering) and, I think, pushes Construct to its limits.

Craig McQueen answered Nov 05 '22 16:11