I'm working on a utility which needs to resolve hex addresses to a symbolic function name and source code line number within a binary. The utility will run on Linux on x86, though the binaries it analyzes will be for a MIPS-based embedded system. The MIPS binaries are in ELF format, using DWARF for the symbolic debugging information.
I'm currently planning to run objdump as a child process, passing in a list of hex addresses and parsing the output to get function names and source line numbers. I have compiled an objdump with support for MIPS binaries, and it is working.
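For illustration, here is a minimal sketch of that fork-and-parse approach. It is not the code from the post: it swaps in addr2line (objdump's sibling in binutils) because its `-f -e` output, two lines per address, is simpler to parse. The `tool` parameter and the function names `parse_addr2line` and `resolve` are assumptions; a cross-built MIPS toolchain would normally install the tool under a prefixed name.

```python
import subprocess


def parse_addr2line(output, addresses):
    """Pair each address with the (function, file, line) printed by addr2line -f."""
    lines = output.splitlines()
    results = {}
    # With -f, addr2line prints two lines per address: function name, then file:line.
    for i, addr in enumerate(addresses):
        func = lines[2 * i]
        filename, _, lineno = lines[2 * i + 1].rpartition(":")
        # The line field may be "?" or carry a " (discriminator N)" suffix.
        fields = lineno.split()
        digits = fields[0] if fields else ""
        results[addr] = (
            None if func == "??" else func,
            None if filename == "??" else filename,
            int(digits) if digits.isdigit() else None,
        )
    return results


def resolve(binary, addresses, tool="addr2line"):
    """Run the (assumed) addr2line executable once for all addresses."""
    cmd = [tool, "-f", "-e", binary] + [hex(a) for a in addresses]
    out = subprocess.run(cmd, check=True, capture_output=True, text=True).stdout
    return parse_addr2line(out, addresses)
```

Passing every address in a single invocation keeps this to one child process per binary rather than one per lookup.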
I'd prefer to have a package allowing me to look things up natively from the Python code without forking another process. I can find no mention of libdwarf, libelf, or libbfd on python.org, nor any mention of python on dwarfstd.org.
Is there a suitable module available somewhere?
You might be interested in the DWARF library from pydevtools:
>>> from bintools.dwarf import DWARF
>>> dwarf = DWARF('test/test')
>>> dwarf.get_loc_by_addr(0x8048475)
('/home/emilmont/Workspace/dbg/test/main.c', 36, 0)
Take a look at pyelftools, a new pure-Python library for parsing ELF files and their DWARF debugging information, which is meant to do exactly this.
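As a rough sketch of how an address-to-line lookup with pyelftools might look, modelled on the address-decoding example that ships with the library: the helper names `lookup_line` and `line_rows` are made up for illustration, and the `file - 1` indexing assumes DWARF versions 2 to 4 (DWARF 5 numbers files from 0). Resolving the function name as well would need an additional walk over the DIEs, which is omitted here.

```python
def lookup_line(rows, address):
    """Find the (filename, line) whose address range contains `address`.

    rows: iterable of (address, filename, line, end_sequence) in line-table order.
    """
    prev = None
    for row in rows:
        addr = row[0]
        # A row covers addresses from its own address up to the next row's.
        if prev is not None and prev[0] <= address < addr:
            return prev[1], prev[2]
        # An end_sequence row closes a range; it does not start a new one.
        prev = None if row[3] else row
    return None


def line_rows(path):
    """Yield line-table rows from an ELF file using pyelftools (third-party)."""
    from elftools.elf.elffile import ELFFile  # pip install pyelftools

    with open(path, "rb") as f:
        elf = ELFFile(f)
        if not elf.has_dwarf_info():
            return
        dwarf = elf.get_dwarf_info()
        for cu in dwarf.iter_CUs():
            prog = dwarf.line_program_for_CU(cu)
            if prog is None:
                continue
            files = prog["file_entry"]
            for entry in prog.get_entries():
                state = entry.state
                if state is None:
                    continue
                # Assumption: DWARF v2-4, where file numbers start at 1.
                name = files[state.file - 1].name
                if isinstance(name, bytes):
                    name = name.decode()
                yield state.address, name, state.line, state.end_sequence
```

A call such as `lookup_line(line_rows("firmware.elf"), 0x8048475)` (the file name is hypothetical) would then return a `(filename, line)` tuple, or `None` if no range covers the address.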
You should give Construct a try. It is very useful for parsing binary data into Python objects.
There is even an example for the ELF32 file format.
I don't know of any, but if all else fails you could use ctypes to directly use libdwarf, libelf or libbfd.
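To give an idea of what the ctypes route involves, here is a minimal sketch that only loads libelf and performs its mandatory version handshake. It assumes a shared libelf is installed on the host; `load_libelf` is a made-up helper name, and everything beyond `elf_version` (opening files, walking sections, calling into libdwarf) would need its own prototypes declared in the same way.

```python
import ctypes
import ctypes.util

EV_CURRENT = 1  # value of EV_CURRENT in <elf.h>


def load_libelf():
    """Return a ctypes handle to libelf, or None if it is unavailable."""
    name = ctypes.util.find_library("elf")
    if name is None:
        return None
    try:
        lib = ctypes.CDLL(name)
    except OSError:
        return None
    # Declare the C prototype: unsigned int elf_version(unsigned int).
    lib.elf_version.argtypes = [ctypes.c_uint]
    lib.elf_version.restype = ctypes.c_uint
    # libelf expects elf_version() to be called before anything else;
    # a return value of EV_NONE (0) signals a version mismatch.
    if lib.elf_version(EV_CURRENT) == 0:
        return None
    return lib
```

The trade-off is that every C struct and function signature has to be mirrored by hand, so this is noticeably more work than using a library that already speaks Python.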
I've been developing a DWARF parser using Construct. It is currently fairly rough, and parsing is slow, but I thought I should at least let you know. It may suit your needs, with a bit of work.
I've got the code in Mercurial, hosted at bitbucket:
Construct is a very interesting library. DWARF is a complex format (as I'm discovering) and, I think, pushes Construct to its limits.