
Translation of machine code into LLVM IR (disassembly / reassembly of x86-64, x86, ARM into LLVM bitcode)

I would like to translate x86-64, x86, and ARM executables into LLVM IR (disassembly).

What solution do you suggest?

asked Aug 08 '11 12:08 by Grzegorz Wierzowiecki




2 Answers

McSema is a production-quality binary lifter. It takes x86 and x86-64 binaries and statically "lifts" them to LLVM IR. It is actively maintained, BSD licensed, and has extensive tests and documentation.

https://github.com/trailofbits/mcsema
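
To give a rough feel for what "lifting" means, here is a toy sketch in Python. It is not how McSema works internally and does not use any McSema API. The function name `lift`, the tuple-based instruction format, and the choice to model each register as a stack slot are all assumptions made for illustration. It handles only `mov`, `add`, `sub`, and `ret` on 64-bit registers and immediates, ignores CPU flags and memory operands, and emits LLVM IR as text using the opaque `ptr` type.

```python
from itertools import count


def is_immediate(operand):
    """Return True for decimal immediates such as "5" or "-3"."""
    return operand.lstrip("-").isdigit()


def lift(instructions):
    """Translate a tiny, hypothetical subset of x86-64 into LLVM IR text.

    `instructions` is a list of tuples such as ("mov", "rax", "5").
    Each register becomes an i64 stack slot; rax holds the return value.
    """
    # Collect every register mentioned, and always include rax for `ret`.
    regs = {op for _, *ops in instructions for op in ops if not is_immediate(op)}
    regs.add("rax")

    tmp = count()
    body = [f"  %{reg} = alloca i64" for reg in sorted(regs)]

    def read(operand):
        """Return an IR value for an operand, loading registers from their slot."""
        if is_immediate(operand):
            return operand
        name = f"%t{next(tmp)}"
        body.append(f"  {name} = load i64, ptr %{operand}")
        return name

    for mnemonic, *ops in instructions:
        if mnemonic == "mov":
            dst, src = ops
            value = read(src)
            body.append(f"  store i64 {value}, ptr %{dst}")
        elif mnemonic in ("add", "sub"):
            dst, src = ops
            lhs = read(dst)
            rhs = read(src)
            result = f"%t{next(tmp)}"
            body.append(f"  {result} = {mnemonic} i64 {lhs}, {rhs}")
            body.append(f"  store i64 {result}, ptr %{dst}")
        elif mnemonic == "ret":
            value = read("rax")
            body.append(f"  ret i64 {value}")
        else:
            raise ValueError(f"unsupported instruction: {mnemonic}")

    return "define i64 @lifted() {\n" + "\n".join(body) + "\n}"


if __name__ == "__main__":
    program = [("mov", "rax", "5"), ("mov", "rbx", "7"), ("add", "rax", "rbx"), ("ret",)]
    print(lift(program))
```

A real lifter additionally has to disassemble the binary, recover control flow, and model flags, memory, and calling conventions, which is where most of the difficulty lies.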

answered Oct 16 '22 06:10 by Dan


Consider using the RevGen tool developed within the S2E project. It converts x86 binaries to LLVM IR. The source code can be checked out from the Revgen branch of the Git repository at https://dslabgit.epfl.ch/git/s2e/s2e.git.

answered Oct 16 '22 06:10 by bsa2000