LLVM JIT and native

Tags: jit, llvm, clang

I don't understand how the LLVM JIT relates to normal, non-JIT compilation, and the documentation isn't good.

For example, suppose I use the clang front end:

  1. Case 1: I compile a C file to native code with clang/LLVM. As I understand it, this flow is like the gcc flow: I get an x86 executable and that runs.
  2. Case 2: I compile into some kind of LLVM IR that runs on the LLVM JIT. In this case, does the executable contain the LLVM runtime to execute the IR on the JIT, or how does it work?

What is the difference between these two, and are they correct? Does the LLVM flow include support for both JIT and non-JIT compilation? When do I want to use the JIT - does it make sense at all for a language like C?

asked Aug 18 '10 by zaharpopov

4 Answers

You have to understand that LLVM is a library that helps you build compilers. Clang is merely a frontend for this library.

Clang translates C/C++ code into LLVM IR and hands it over to LLVM, which compiles it into native code.

LLVM is also able to generate native code directly in memory, which can then be called like a normal function. So cases 1 and 2 share LLVM's optimization and code generation.

So how does one use LLVM as a JIT compiler? You build an application which generates some LLVM IR (in memory), then use the LLVM library to generate native code (still in memory). LLVM hands you back a function pointer that you can call afterwards. No clang involved.
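
For illustration, here is a minimal sketch of that flow using the LLVM-C API with MCJIT. The file name, the function name "add", and the module name are made up for the example, and the exact C API varies a bit between LLVM versions, so treat this as a sketch rather than a definitive recipe:

// jit_add.c - hedged sketch: build IR in memory, JIT it with MCJIT,
// and call the result through a function pointer.
#include <stdio.h>
#include <stdint.h>
#include <llvm-c/Core.h>
#include <llvm-c/ExecutionEngine.h>
#include <llvm-c/Target.h>

int main(void) {
    // Register MCJIT and the native target once per process.
    LLVMLinkInMCJIT();
    LLVMInitializeNativeTarget();
    LLVMInitializeNativeAsmPrinter();

    // Build a module containing  i32 add(i32, i32)  entirely in memory.
    LLVMContextRef ctx = LLVMContextCreate();
    LLVMModuleRef mod = LLVMModuleCreateWithNameInContext("jit_module", ctx);
    LLVMTypeRef i32 = LLVMInt32TypeInContext(ctx);
    LLVMTypeRef params[2] = { i32, i32 };
    LLVMValueRef fn = LLVMAddFunction(mod, "add", LLVMFunctionType(i32, params, 2, 0));
    LLVMBuilderRef b = LLVMCreateBuilderInContext(ctx);
    LLVMPositionBuilderAtEnd(b, LLVMAppendBasicBlockInContext(ctx, fn, "entry"));
    LLVMBuildRet(b, LLVMBuildAdd(b, LLVMGetParam(fn, 0), LLVMGetParam(fn, 1), "sum"));

    // Hand the module to the execution engine; it compiles to native code in memory.
    LLVMExecutionEngineRef ee;
    char *error = NULL;
    if (LLVMCreateExecutionEngineForModule(&ee, mod, &error)) {
        fprintf(stderr, "failed to create execution engine: %s\n", error);
        return 1;
    }

    // Get a pointer to the JIT-compiled code and call it like a normal function.
    int (*add)(int, int) = (int (*)(int, int))(uintptr_t)LLVMGetFunctionAddress(ee, "add");
    printf("add(2, 3) = %d\n", add(2, 3));

    LLVMDisposeBuilder(b);
    LLVMDisposeExecutionEngine(ee);   // also frees the module
    LLVMContextDispose(ctx);
    return 0;
}

Depending on how LLVM is installed, this should build with the compiler and linker flags reported by llvm-config (for example --cflags, --ldflags, --libs and --system-libs); the LLVM libraries are C++, so you may need to link as C++.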

You can, however, use clang to translate some C code into LLVM IR and load this into your JIT context to use the functions.
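
A hedged sketch of that variant, again with the LLVM-C API: the file name lib.bc and the function name foo are assumptions for the example (e.g. bitcode produced with clang -c -emit-llvm lib.c -o lib.bc), and the bit reader entry points differ slightly between LLVM versions:

// run_bc.c - hedged sketch: load clang-generated bitcode and execute a
// function from it with MCJIT. "lib.bc" and "foo" are made-up names.
#include <stdio.h>
#include <stdint.h>
#include <llvm-c/Core.h>
#include <llvm-c/BitReader.h>
#include <llvm-c/ExecutionEngine.h>
#include <llvm-c/Target.h>

int main(void) {
    LLVMLinkInMCJIT();
    LLVMInitializeNativeTarget();
    LLVMInitializeNativeAsmPrinter();

    // Read the bitcode file that clang emitted and parse it into a module.
    LLVMContextRef ctx = LLVMContextCreate();
    LLVMMemoryBufferRef buf;
    LLVMModuleRef mod;
    char *error = NULL;
    if (LLVMCreateMemoryBufferWithContentsOfFile("lib.bc", &buf, &error) ||
        LLVMParseBitcodeInContext2(ctx, buf, &mod)) {
        fprintf(stderr, "failed to load lib.bc\n");
        return 1;
    }

    // JIT the module and look up the function by name.
    LLVMExecutionEngineRef ee;
    if (LLVMCreateExecutionEngineForModule(&ee, mod, &error)) {
        fprintf(stderr, "failed to create execution engine: %s\n", error);
        return 1;
    }
    void (*foo)(void) = (void (*)(void))(uintptr_t)LLVMGetFunctionAddress(ee, "foo");
    foo();

    LLVMDisposeExecutionEngine(ee);
    LLVMContextDispose(ctx);
    return 0;
}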

Real World examples:

  • Unladen Swallow Python VM
  • Rubinius Ruby VM

There is also the Kaleidoscope tutorial, which shows how to implement a simple language with a JIT compiler.

answered by ebo


First, you get LLVM bitcode (LLVM IR):

clang -emit-llvm -c -o test.bc test.c

Second, you use LLVM JIT:

lli test.bc

That runs the program.

Then, if you want native code, you use the LLVM backend:

llc test.bc -o test.s

Assemble and link the output, for example with the GNU tools:

as test.s -o test.o
gcc test.o -o test
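
Alternatively, with a reasonably recent llc you can skip the external assembler and have llc emit an object file directly, then link it (a hedged variant of the same flow):

llc -filetype=obj test.bc -o test.o
gcc test.o -o test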

answered by Ariel


The steps below for compiling and running the JIT'ed code are taken from a mail message on the LLVM mailing list:

[LLVMdev] MCJIT and Kaleidoscope Tutorial

Header file:

// foo.h
extern void foo(void);

and the implementation of a simple foo() function:

//foo.c
#include <stdio.h>
void foo(void) {
    puts("Hello, I'm a shared library");
}

And the main function:

//main.c
#include <stdio.h>
#include "foo.h"
int main(void) {
    puts("This is a shared library test...");
    foo();
    return 0;
}

Build the shared library using foo.c:

gcc foo.c -shared -o libfoo.so -fPIC

Generate the LLVM bitcode for the main.c file:

clang -Wall -c -emit-llvm -O3 main.c -o main.bc

And run the LLVM bitcode through the JIT (and through MCJIT) to get the desired output (note that on newer LLVM releases MCJIT is the default, so the -use-mcjit form is only needed on older versions):

lli -load=./libfoo.so main.bc
lli -use-mcjit -load=./libfoo.so main.bc

You can also pipe the clang output into lli:

clang -Wall -c -emit-llvm -O3 main.c -o - | lli -load=./libfoo.so 

Output

This is a shared library test...
Hello, I'm a shared library

Source obtained from: Shared libraries with GCC on Linux

answered by Sriram Murali


Most compilers have a front end, an intermediate code/structure of some sort, and a back end. When you take your C program and compile it with clang such that you end up with a non-JIT x86 program you can just run, you have still gone from front end to middle to back end. The same goes for gcc: it goes from a front end to a middle representation to a back end. gcc's middle representation is just not wide open and usable on its own the way LLVM's is.

Now one thing that is fun/interesting about LLVM, which you cannot do with other compilers, or at least not with gcc, is that you can take all of your source modules, compile them to LLVM bitcode, merge them into one big bitcode file, and then optimize the whole thing. Instead of the per-file or per-function optimization you get with other compilers, with LLVM you can get any level of partial to complete whole-program optimization you like (see the command sketch below). Then you can take that bitcode and use llc to export it to the target's assembler. I normally do embedded work, so I have my own startup code that I wrap around that, but in theory you should be able to take that assembler file, compile and link it with gcc, and run it: gcc myfile.s -o myfile. I imagine there is a way to get the LLVM tools to do this without binutils or gcc, but I have not taken the time.
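
A hedged sketch of that flow with the stock LLVM tools (the file names are made up; newer versions of opt may want -passes='default<O3>' instead of -O3):

clang -c -emit-llvm a.c -o a.bc
clang -c -emit-llvm b.c -o b.bc
llvm-link a.bc b.bc -o whole.bc
opt -O3 whole.bc -o whole.opt.bc
llc whole.opt.bc -o whole.s
gcc whole.s -o whole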

I like LLVM because it is always a cross compiler; unlike gcc, you don't have to build a new toolchain for each target and deal with each target's nuances. What I am saying is that I don't know that I have any use for the JIT; I use LLVM as a cross compiler and as a native compiler.
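
For example (a hedged illustration; the target triple here is arbitrary), the same bitcode can be lowered for a different target just by telling llc which triple to use:

llc -mtriple=armv7-linux-gnueabihf test.bc -o test-arm.s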

So your first case is front, middle, and end, with the process hidden from you: you start with source and get a binary, done. The second case, if I understand it right, is the front and the middle, stopping with some file that represents the middle. Then the middle-to-end step (for the specific target processor) can happen just in time, at runtime. The difference there is the back end: the runtime execution engine for the middle language in case two is likely different from the back end in case one.

answered by old_timer