
Disassembling with python - no easy solution?

I'm trying to create a Python script that will disassemble a binary (a Windows exe, to be precise) and analyze its code. I need to be able to take a given buffer and extract some sort of struct containing information about the instructions in it.

I've worked with libdisasm in C before, and I found its interface quite intuitive and comfortable. The problem is that its Python interface is available only through SWIG, and I can't get it to compile properly under Windows.

In terms of availability, diStorm provides a nice out-of-the-box interface, but it gives only the mnemonic of each instruction, not a binary struct with enumerations defining the instruction type and so on. This is quite inconvenient for my purpose, and I would have to spend a lot of time wrapping the interface to make it fit my needs.
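To give a rough idea of the wrapping involved, here is a minimal sketch. It assumes the decoder yields `(offset, size, text, hexdump)` tuples, which is the shape I understand diStorm's `Decode` to return. The `Kind` enum, the `_KINDS` table, and the `Instruction` record are hypothetical names made up for illustration, and instruction prefixes such as `REP` are not handled.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    """Coarse instruction categories (hypothetical, for illustration)."""
    BRANCH = "branch"
    CALL = "call"
    RETURN = "return"
    STACK = "stack"
    OTHER = "other"


# Hypothetical mnemonic table; a real one would cover far more opcodes.
_KINDS = {
    "JMP": Kind.BRANCH, "JZ": Kind.BRANCH, "JNZ": Kind.BRANCH,
    "CALL": Kind.CALL,
    "RET": Kind.RETURN, "RETN": Kind.RETURN,
    "PUSH": Kind.STACK, "POP": Kind.STACK,
}


@dataclass(frozen=True)
class Instruction:
    offset: int
    size: int
    mnemonic: str
    operands: tuple
    kind: Kind


def wrap(decoded):
    """Turn (offset, size, text, hexdump) tuples into Instruction records."""
    result = []
    for offset, size, text, _hexdump in decoded:
        # Split "MOV EBP, ESP" into the mnemonic and a comma-separated operand list.
        mnemonic, _, rest = text.strip().partition(" ")
        operands = tuple(op.strip() for op in rest.split(",") if op.strip())
        kind = _KINDS.get(mnemonic.upper(), Kind.OTHER)
        result.append(Instruction(offset, size, mnemonic.upper(), operands, kind))
    return result


# Hand-written sample input in the assumed tuple shape (not real decoder output).
sample = [
    (0, 1, "PUSH EBP", "55"),
    (1, 2, "MOV EBP, ESP", "8bec"),
    (3, 1, "RET", "c3"),
]
instructions = wrap(sample)
```

Classifying by mnemonic text like this is fragile compared to a decoder that exposes real type enumerations, which is why a library returning structured output would be preferable.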

I've also looked at BeaEngine, which does in fact provide the output I need (a struct with binary info about each instruction), but its interface is really odd and counter-intuitive, and it crashes pretty much instantly when given wrong arguments: the ctypes sort of ultimate-death-to-your-Python crash.

So, I'd be happy to hear about other solutions that are a little less time-consuming than messing around with DJGPP or MinGW to build a SWIGed libdisasm, or writing an OOP wrapper for diStorm. If anyone has some guidance on how to compile SWIGed libdisasm, or better yet, a compiled binary (pyd or dll+py), I'd love to hear/have it. :)

Thanks ahead.

Abc4599 asked Feb 07 '10 13:02



3 Answers

Well, after much meddling around, I managed to compile SWIGed libdisasm! Unfortunately, it seems to crash Python on incorrect (and sometimes correct) usage. Here is how I did it:

  1. I compiled libdisasm.lib using Visual Studio 6. All you need for this is the source code of whichever libdisasm release you use, plus stdint.h and inttypes.h (the Visual C++ compatible versions; search for them online).
  2. I SWIGed the given libdisasm_oop.i file with the following command line:

    swig -python -shadow -o x86disasm_wrap.c -outdir . libdisasm_oop.i

  3. I used Cygwin to run ./configure in the libdisasm root directory. The only thing you really get from this is config.h.

  4. I then created a new DLL project, added x86disasm_wrap.c to it, added the c:\PythonXX\libs and c:\PythonXX\Include folders to the corresponding library and include path settings, and set the project to the Release configuration (important: either do this or add #undef _DEBUG before including Python.h). You may also need to fix the path to config.h.

  5. I compiled the DLL project and named the output _x86disasm.dll. Place that in the same folder as the SWIG-generated x86disasm.py and you're done.
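
Since the resulting module can take the interpreter down with it, one cautious way to check the build is to attempt the import in a child process first. The following is only a sketch: `probe_import` is a hypothetical helper name, and it relies on nothing but the standard library. As far as I know, Python 2.5 and later only load Windows extension modules with a `.pyd` suffix, so the DLL may need to be renamed to `_x86disasm.pyd` before this succeeds.

```python
import subprocess
import sys


def probe_import(module_name, search_dir=None, timeout=30):
    """Try importing a module in a child interpreter.

    Returns True if the import succeeded. A hard crash inside the
    extension only kills the child process, not the caller.
    """
    code = "import sys, importlib\n"
    if search_dir is not None:
        # The directory is passed as the second command-line argument.
        code += "sys.path.insert(0, sys.argv[2])\n"
    code += "importlib.import_module(sys.argv[1])\n"
    argv = [sys.executable, "-c", code, module_name]
    if search_dir is not None:
        argv.append(search_dir)
    try:
        completed = subprocess.run(argv, capture_output=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return False
    return completed.returncode == 0
```

For the build above, the call would be something like `probe_import("x86disasm", search_dir=".")`, run from the folder holding the DLL and x86disasm.py.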

Any suggestions for other, less crashy disassembly libraries for Python?

Abc4599 answered Nov 20 '22 15:11



You might try using ctypes to interface directly with libdisasm instead of going through a SWIG layer. It may take more development time, but AFAIK you should be able to access the underlying functionality using ctypes.
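
One thing that may help with the crash-on-bad-arguments problem when going the ctypes route is declaring `argtypes` and `restype` for every function you bind. The sketch below shows the idea on the C runtime's `strlen`, used purely as a stand-in; binding libdisasm the same way would mean transcribing its prototypes and structs from its header, which is not attempted here. `load_c_runtime`, `safe_strlen`, and `rejects` are hypothetical helper names.

```python
import ctypes
import ctypes.util
import os


def load_c_runtime():
    """Load the platform C runtime (stand-in for a disassembler DLL)."""
    if os.name == "nt":
        return ctypes.CDLL("msvcrt")
    name = ctypes.util.find_library("c")
    return ctypes.CDLL(name) if name else ctypes.CDLL(None)


libc = load_c_runtime()

# Declare the prototype: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def safe_strlen(data):
    """Call strlen; wrong argument types raise instead of reaching C."""
    return libc.strlen(data)


def rejects(value):
    """Return True if ctypes refuses the argument before calling into C."""
    try:
        libc.strlen(value)
    except ctypes.ArgumentError:
        return True
    return False
```

Without the `argtypes` declaration, an integer such as `0xDEADBEEF` would be handed to `strlen` as if it were a pointer, which is the kind of call that kills the interpreter. With it, `rejects(0xDEADBEEF)` reports that ctypes raised `ArgumentError` on the Python side instead. This does not protect against valid-looking but wrong pointers or buffer sizes, so it narrows the problem rather than removing it.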

Vinay Sajip answered Nov 20 '22 14:11



I recommend you look at Pym's disassembly library, which is also the backend for Pym's online disassembler.

Nick Sonneveld answered Nov 20 '22 14:11
