Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using LLVM as virtual machine - multiplatform and multiarchitecture coding

I'm currently working in a pet programming language (for learning purposes), and have gone through a lot of research over the past year, and I think its time to finally start modelling the concepts of such a languague. First of all I want it to compile to some intermediate form, such as JVM or .NET bytecode, the goal being multiplatform/architecture compatibily. Second, I want it to be fast (I also have many other things in mind, but its not the purpose of this topic to discuss those).

The best options that came to my mind were: Compile to JVM bytecode and use OpenJDK as runtime environment, Compile to .NET bytecode and use Mono as runtime environment, Compile to LLVM IR and use LLVM as runtime environment.

As you may have imagined, I've chosen LLVM. Why? because its blazing fast. I did a little benchmark using the C++ N-Body code, and achieved 7s in my machine with lli jitted IR, in contrast with 27s with clang native compiled code (I know clang first make IR then machine code).

So, here is my question: Is there any redistributable version of the LLVM basic toolset (I just need lli) that I can use? Or I must compile my own? If the latter, can you provide me with any hints on how to do it? If I really must do it, I'm thinking is cross-compiling them from my machine (Intel Mac), and generating some installers (say, an .msi for windows, .rpm and .deb for popular linux distros and .pkg for Macs). Remember, I only need a minimal subset of LLVM, such that this subset is capable of acting like a VM, by using "lli ". The real question here is how to use LLVM as a typical virtual machine.

like image 733
Salvia Avatar asked May 02 '13 22:05

Salvia


People also ask

Is LLVM a virtual machine?

LLVM is an acronym that stands for low level virtual machine. It also refers to a compiling technology called the LLVM project, which is a collection of modular and reusable compiler and toolchain technologies.

Does LLVM compile to machine code?

LLVM is a language-agnostic compiler toolchain that handles program optimization and code generation. It is based on its own internal representation, called LLVM IR, which is then transformed into machine code.

Is LLVM like JVM?

The biggest difference between JVM bytecode and and LLVM bitcode is that JVM instructions are stack-oriented, whereas LLVM bitcode is not. This means that rather than loading values into registers, JVM bytecode loads values onto a stack and computes values from there.

Is LLVM a programming language?

LLVM is a set of compiler and toolchain technologies that can be used to develop a front end for any programming language and a back end for any instruction set architecture.


1 Answers

First, I think all 3 options - LLVM IR + LLVM, Java Bytecode + OpenJDK, and .NET CIL + Mono - are excellent options, and I agree deciding between them is not easy.

If you go for LLVM and you just want to use lli, you can compile LLVM to your target platform and pack the resulting lli executable with your distribution, it should work.

Another way to write a JIT compiler via LLVM is to use an execution engine - see the handy examples in the Kaleidoscope tutorial. That means that you write your own program which will JIT-compile your own language, compile it to whatever platform you want while statically linking it with LLVM, and then distribute it.

In any case, since a JIT compiler requires copying an LLVM binary to the client side, make sure to attach a copyright notice with your distribution (you don't have to open-source your distribution, though).

like image 87
Oak Avatar answered Sep 25 '22 03:09

Oak