
LLVMCreateDisasm returns NULL

I am trying to disassemble some bytes using LLVM's C interface. However, LLVMCreateDisasm() returns NULL.

#include <stdio.h> // printf()
#include <stdlib.h> // EXIT_FAILURE, EXIT_SUCCESS

#define __STDC_CONSTANT_MACROS // llvm complains otherwise
#define __STDC_LIMIT_MACROS
#include <llvm-c/Disassembler.h>

int main()
{
    LLVMDisasmContextRef dc = LLVMCreateDisasm (
        "testname",
        NULL,
        0,
        NULL,
        NULL
    );
    if (dc == NULL) {
        printf("Could not create disassembler");
        return EXIT_FAILURE;
    }

    return EXIT_SUCCESS;
}

I am on x64 Linux. Judging by the documentation, I seem to be doing everything right:

LLVMDisasmContextRef LLVMCreateDisasm   (
    const char *             TripleName,
    void *                   DisInfo,
    int                      TagType,
    LLVMOpInfoCallback       GetOpInfo,
    LLVMSymbolLookupCallback SymbolLookUp 
)

Create a disassembler for the TripleName. Symbolic disassembly is supported by passing a block of information in the DisInfo parameter and specifying the TagType and callback functions as described above. These can all be passed as NULL. If successful, this returns a disassembler context. If not, it returns NULL.

Update

  1. My llvm version is 3.4
  2. I tried every possible triple/target I could think of, still the same.
  3. I inserted print statements into LLVMCreateDisasmCPU() in lib/MC/MCDisassembler/Disassembler.cpp, and it fails at the first if check. The Error string at that point is "Unable to find target for this triple (no targets are registered)"

    LLVMDisasmContextRef LLVMCreateDisasmCPU(const char *Triple, const char *CPU,
                                         void *DisInfo, int TagType,
                                         LLVMOpInfoCallback GetOpInfo,
                                         LLVMSymbolLookupCallback SymbolLookUp){
        std::cout << ">>> Triplename: " << Triple << std::endl;
        // Get the target.
        std::string Error;
        const Target *TheTarget = TargetRegistry::lookupTarget(Triple, Error);
        if (!TheTarget) {
            std::cout << "Failed 1: " << Error << std::endl;
            return 0;
        }
        ...
    

    So it fails at the lookupTarget call.

  4. Looking at lookupTarget() in lib/Support/TargetRegistry.cpp, it fails at the first if check. The comment there gives a clue:

    const Target *TargetRegistry::lookupTarget(const std::string &TT,
                                               std::string &Error) {
        // Provide special warning when no targets are initialized.
        if (begin() == end()) {
            Error = "Unable to find target for this triple (no targets are registered)";
            return 0;
        }
        ...
    

    So it turns out I have to initialize a target first.

  5. In my code I now call LLVMInitializeAllTargetInfos() from the llvm-c/Target.h header first. Now it fails at the second if check in LLVMCreateDisasmCPU() in Disassembler.cpp:

    const MCRegisterInfo *MRI = TheTarget->createMCRegInfo(Triple);
    if (!MRI) {
        std::cout << "Failed 2: " << Error << std::endl;
        return 0;
    }
    

    with this Error string: Could not create disassembler

Finally Solved!

I just had to call LLVMInitializeAllTargetInfos(), LLVMInitializeAllTargetMCs(), and LLVMInitializeAllDisassemblers() before creating the disassembler context:

#include <stdio.h> // printf()
#include <stdlib.h> // EXIT_FAILURE, EXIT_SUCCESS

#define __STDC_CONSTANT_MACROS // llvm complains otherwise
#define __STDC_LIMIT_MACROS
#include <llvm-c/Disassembler.h>
#include <llvm-c/Target.h>

int main()
{
    LLVMInitializeAllTargetInfos();
    LLVMInitializeAllTargetMCs();
    LLVMInitializeAllDisassemblers();

    LLVMDisasmContextRef dc = LLVMCreateDisasm (
        "x86_64-unknown-linux-gnu",
        NULL,
        0,
        NULL,
        NULL
    );
    if (dc == NULL) {
        printf("Could not create disassembler");
        return EXIT_FAILURE;
    }

    return EXIT_SUCCESS;
}
Babken Vardanyan asked Jan 30 '14 17:01


1 Answer

The first argument of LLVMCreateDisasm,

"testname"

is not a valid TripleName. The TripleName tells LLVM which target you want, and it is needed because a single LLVM installation contains support for several targets.

You can list the supported target architectures by running the command

llc -version

Among them are targets for x86 and x86_64:

x86      - 32-bit X86: Pentium-Pro and above
x86-64   - 64-bit X86: EM64T and AMD64

To construct a correct TripleName, pick a suitable architecture (subarch) for your target, such as i486 or x86_64, then add the vendor and OS:

http://llvm.org/docs/doxygen/html/Triple_8h_source.html

/// Triple - Helper class for working with autoconf configuration names. For
/// historical reasons, we also call these 'triples' (they used to contain
/// exactly three fields).
///
/// Configuration names are strings in the canonical form:
///   ARCHITECTURE-VENDOR-OPERATING_SYSTEM
/// or
///   ARCHITECTURE-VENDOR-OPERATING_SYSTEM-ENVIRONMENT

The same header defines the ArchType enum, whose comments list the recognized architecture names (the actual parser is parseArch in lib/Support/Triple.cpp), for example:

arm,     // ARM: arm, armv.*, xscale
aarch64, // AArch64: aarch64
....
x86,     // X86: i[3-9]86
x86_64,  // X86-64: amd64, x86_64

In the same file there are the valid vendors (enum VendorType), OS types (enum OSType) and environments (enum EnvironmentType). In most cases you can use "unknown" for the vendor and OS, but "-unknown-linux-gnu" is commonly used.

Some examples of valid TripleNames:

x86_64--linux-gnu
x86_64-unknown-linux-gnu
i486--linux-gnu

Valid triples are described further in the clang documentation: http://clang.llvm.org/docs/CrossCompilation.html and some valid names are listed in https://stackoverflow.com/a/18576360/196561

Another limitation of LLVMCreateDisasm is that not all targets have an MCDisassembler implemented. For example, in LLVM 2.9 there are MCDisassemblers only for X86, X86_64, ARM and MBlaze; more recent versions (svn from 2014-02-01) also have them for Sparc, PPC, MIPS, SystemZ, XCore and AArch64.

If you are unable to create an MCDisassembler even with a correct triple, there are several ways to debug the LLVMCreateDisasmCPU function in lib/MC/MCDisassembler/Disassembler.cpp. You can break into it with gdb and step with "next" until the error (this is easier with a debug build of LLVM), or you can add debugging printf's to LLVMCreateDisasmCPU, or temporarily change the return value from plain NULL into something different for each error.

UPDATE: It seems that LLVM's targets were not initialized at the time of the call. There are many initializers in the llvm-c/Target.h header in current LLVM (~3.4 or newer):

LLVMInitializeAllTargetInfos() - The main program should call this function if it wants access to all available targets that LLVM is configured to support.

LLVMInitializeAllTargets() - The main program should call this function if it wants to link in all available targets that LLVM is configured to support.

LLVMInitializeAllTargetMCs() - The main program should call this function if it wants access to all available target MC that LLVM is configured to support.

LLVMInitializeAllDisassemblers() - The main program should call this function if it wants all disassemblers that LLVM is configured to support, to make them available via the TargetRegistry.

LLVMInitializeAllAsmPrinters() - The main program should call this function if it wants all asm printers that LLVM is configured to support, to make them available via the TargetRegistry.

and so on (https://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm-c/Target.h?logsort=rev&diff_format=h&r1=192697&r2=192696&pathrev=192697).

There is even an LLVMInitializeNativeTarget function, which initializes the native target:

LLVMInitializeNativeTarget() - The main program should call this function to initialize the native target corresponding to the host. This is useful for JIT applications to ensure that the target gets linked in correctly.

osgx answered Oct 23 '22 06:10