I am trying to disassemble some bytes using LLVM's C interface. However, LLVMCreateDisasm() returns NULL.
#include <stdio.h> // printf()
#include <stdlib.h> // EXIT_FAILURE, EXIT_SUCCESS
#define __STDC_CONSTANT_MACROS // llvm complains otherwise
#define __STDC_LIMIT_MACROS
#include <llvm-c/Disassembler.h>
int main()
{
    LLVMDisasmContextRef dc = LLVMCreateDisasm (
        "testname",
        NULL,
        0,
        NULL,
        NULL
    );
    if (dc == NULL) {
        printf("Could not create disassembler");
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}
I am on x64 Linux. Looking at the documentation, it seems like I am doing everything right.
LLVMDisasmContextRef LLVMCreateDisasm (
const char * TripleName,
void * DisInfo,
int TagType,
LLVMOpInfoCallback GetOpInfo,
LLVMSymbolLookupCallback SymbolLookUp
)
Create a disassembler for the TripleName. Symbolic disassembly is supported by passing a block of information in the DisInfo parameter and specifying the TagType and callback functions as described above. These can all be passed as NULL. If successful, this returns a disassembler context. If not, it returns NULL.
I inserted print statements into LLVMCreateDisasmCPU() in lib/MC/MCDisassembler/Disassembler.cpp, and it fails at the first if check. The Error string at that point is "Unable to find target for this triple (no targets are registered)".
LLVMDisasmContextRef LLVMCreateDisasmCPU(const char *Triple, const char *CPU,
                                         void *DisInfo, int TagType,
                                         LLVMOpInfoCallback GetOpInfo,
                                         LLVMSymbolLookupCallback SymbolLookUp){
    std::cout << ">>> Triplename: " << Triple << std::endl;
    // Get the target.
    std::string Error;
    const Target *TheTarget = TargetRegistry::lookupTarget(Triple, Error);
    if (!TheTarget) {
        std::cout << "Failed 1: " << Error << std::endl;
        return 0;
    }
    ...
So it fails at the lookupTarget call.
Looking at lookupTarget() in lib/Support/TargetRegistry.cpp, it fails at the first if check. The comment there gives some clues:
const Target *TargetRegistry::lookupTarget(const std::string &TT,
                                           std::string &Error) {
    // Provide special warning when no targets are initialized.
    if (begin() == end()) {
        Error = "Unable to find target for this triple (no targets are registered)";
        return 0;
    }
    ...
So it turns out I have to initialize a target first.
In my code I first call LLVMInitializeAllTargetInfos(); from the llvm-c/Target.h header. Now it fails at the second if check in LLVMCreateDisasmCPU() in Disassembler.cpp:
const MCRegisterInfo *MRI = TheTarget->createMCRegInfo(Triple);
if (!MRI) {
    std::cout << "Failed 2: " << Error << std::endl;
    return 0;
}
with this Error string: Could not create disassembler
I just had to call LLVMInitializeAllTargetInfos(), LLVMInitializeAllTargetMCs() and LLVMInitializeAllDisassemblers() before creating the disassembler context:
#include <stdio.h> // printf()
#include <stdlib.h> // EXIT_FAILURE, EXIT_SUCCESS
#define __STDC_CONSTANT_MACROS // llvm complains otherwise
#define __STDC_LIMIT_MACROS
#include <llvm-c/Disassembler.h>
#include <llvm-c/Target.h>
int main()
{
    LLVMInitializeAllTargetInfos();
    LLVMInitializeAllTargetMCs();
    LLVMInitializeAllDisassemblers();
    LLVMDisasmContextRef dc = LLVMCreateDisasm (
        "x86_64-unknown-linux-gnu",
        NULL,
        0,
        NULL,
        NULL
    );
    if (dc == NULL) {
        printf("Could not create disassembler");
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}
The first argument of LLVMCreateDisasm, "testname", is not a valid TripleName. The TripleName tells LLVM what your target is; this is needed because a single LLVM installation contains support for several targets.
You can list the supported target architectures by running the command
llc -version
Its output includes targets for x86 and x86_64:
x86 - 32-bit X86: Pentium-Pro and above
x86-64 - 64-bit X86: EM64T and AMD64
To construct a correct TripleName, pick a suitable architecture or subarchitecture for your target (i486 or x86_64), then add the vendor and OS:
http://llvm.org/docs/doxygen/html/Triple_8h_source.html
/// Triple - Helper class for working with autoconf configuration names. For
/// historical reasons, we also call these 'triples' (they used to contain
/// exactly three fields).
///
/// Configuration names are strings in the canonical form:
/// ARCHITECTURE-VENDOR-OPERATING_SYSTEM
/// or
/// ARCHITECTURE-VENDOR-OPERATING_SYSTEM-ENVIRONMENT
That file has an ArchType enum whose comments list the recognized architectures (the actual parser is parseArch in lib/Support/Triple.cpp), for example:
arm, // ARM: arm, armv.*, xscale
aarch64, // AArch64: aarch64
....
x86, // X86: i[3-9]86
x86_64, // X86-64: amd64, x86_64
The same file lists the valid vendors (enum VendorType), OS types (enum OSType) and environments (enum EnvironmentType). In most cases you can use "unknown" for vendor and OS, but "-unknown-linux-gnu" is often used.
Some examples of valid TripleNames:
x86_64--linux-gnu
x86_64-unknown-linux-gnu
i486--linux-gnu
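To make the ARCHITECTURE-VENDOR-OPERATING_SYSTEM[-ENVIRONMENT] layout concrete, here is a small sketch that splits a triple string into its fields. It is only an illustration in plain C: the struct triple_parts type and the split_triple function are invented for this example, and LLVM's real parser in lib/Support/Triple.cpp does considerably more (normalization, aliases, enum lookup).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical type: the four fields of a triple, empty when absent. */
struct triple_parts {
    char arch[32];
    char vendor[32];
    char os[32];
    char env[32];
};

/* Copy at most dst_size-1 bytes of src[0..len) into dst and terminate it. */
void copy_field(char *dst, size_t dst_size, const char *src, size_t len)
{
    if (len >= dst_size)
        len = dst_size - 1;
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/* Split "arch-vendor-os[-env]" on '-'. The first three fields end at a
   dash; the fourth takes whatever is left. Returns the number of fields
   found (1 to 4). */
int split_triple(const char *triple, struct triple_parts *out)
{
    char *fields[4];
    const char *p = triple;
    int count = 0;

    memset(out, 0, sizeof *out);
    fields[0] = out->arch;
    fields[1] = out->vendor;
    fields[2] = out->os;
    fields[3] = out->env;

    while (count < 4) {
        const char *dash = (count < 3) ? strchr(p, '-') : NULL;
        size_t len = dash ? (size_t)(dash - p) : strlen(p);

        copy_field(fields[count], sizeof out->arch, p, len);
        count++;
        if (dash == NULL)
            break;
        p = dash + 1;
    }
    return count;
}
```

Splitting "x86_64--linux-gnu" this way gives an empty vendor field between the two dashes, while "x86_64-unknown-linux-gnu" gives the vendor "unknown"; the other three fields are identical in both.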
There is a fuller description of the triples clang accepts here: http://clang.llvm.org/docs/CrossCompilation.html and some valid names are listed in https://stackoverflow.com/a/18576360/196561
Another limitation of LLVMCreateDisasm is that not all targets have an MCDisassembler implemented. For example, in LLVM-2.9 there are MCDisassemblers only for X86, X86_64, ARM and MBlaze; in more recent versions (svn from 2014-02-01) there are also ones for Sparc, PPC, MIPS, SystemZ, XCore and AArch64.
If you are unable to create an MCDisassembler even with a correct triple, there are several ways to debug the LLVMCreateDisasmCPU function from the MC/MCDisassembler/Disassembler.cpp file. You can break into it with gdb and step with "next" until the error (this is nicer and easier with a debug build of LLVM); or you can add some debugging printf's to LLVMCreateDisasmCPU, or temporarily change the return value from plain NULL to a different value for each error.
UPDATE: It seems that your LLVM was not initialized at the time of the call. There are many LLVM initializers in the llvm-c/Target.h header in current LLVM (~3.4 or newer):
LLVMInitializeAllTargetInfos() - The main program should call this function if it wants access to all available targets that LLVM is configured to support.
LLVMInitializeAllTargets() - The main program should call this function if it wants to link in all available targets that LLVM is configured to support.
LLVMInitializeAllTargetMCs() - The main program should call this function if it wants access to all available target MC that LLVM is configured to support.
LLVMInitializeAllDisassemblers() - The main program should call this function if it wants all disassemblers that LLVM is configured to support, to make them available via the TargetRegistry.
LLVMInitializeAllAsmPrinters() - The main program should call this function if it wants all asm printers that LLVM is configured to support, to make them available via the TargetRegistry.
and so on (https://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm-c/Target.h?logsort=rev&diff_format=h&r1=192697&r2=192696&pathrev=192697).
There is even an LLVMInitializeNativeTarget function, which initializes the native target:
LLVMInitializeNativeTarget() - The main program should call this function to initialize the native target corresponding to the host. This is useful for JIT applications to ensure that the target gets linked in correctly.