I want to design an IR (like LLVM IR) for my toy compiler, and I don't know what the purpose of the alloca instruction is for later analysis. In which optimizations is alloca information used?
A developer uses the API to generate instructions in a format called an intermediate representation, or IR. LLVM can then compile the IR into a standalone binary or perform a JIT (just-in-time) compilation on the code to run in the context of another program, such as an interpreter or runtime for the language.
LLVM IR is a low-level intermediate representation used by the LLVM compiler framework. You can think of LLVM IR as a platform-independent assembly language with an infinite number of function local registers.
The purpose of the alloca instruction is to make it easier to generate code for imperative languages with mutable variables. Without it, you would have to construct SSA form for all your assignments yourself, and with mutable variables this can become quite complex.
Let's look at this simple C program (taken from the LLVM docs) and focus on the variable X and its assignments. As you can see, its final value (which we return) depends on which branch of the if is taken.
int G, H;
int test(_Bool Condition) {
  int X;
  if (Condition)
    X = G;
  else
    X = H;
  return X;
}
Translating the example program to LLVM IR without alloca
Now, if we were to translate this program to LLVM IR, we would get something like:
@G = weak global i32 0   ; type of @G is i32*
@H = weak global i32 0   ; type of @H is i32*

define i32 @test(i1 %Condition) {
entry:
  br i1 %Condition, label %cond_true, label %cond_false

cond_true:
  %X.0 = load i32, i32* @G
  br label %cond_next

cond_false:
  %X.1 = load i32, i32* @H
  br label %cond_next

cond_next:
  %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
  ret i32 %X.2
}
Because there are two different possible values for X before the return instruction, we have to insert a Phi node to merge them. Why is that? Because LLVM requires all register values to be in SSA form, and the Phi node is the only way to merge two values coming from different predecessor blocks. However, such SSA construction with Phi nodes requires non-trivial algorithms and is inconvenient and wasteful to re-implement in every compiler.
So how do we solve this problem? The answer, as you may have already guessed, is alloca!
Translating the example program to LLVM IR with alloca
The ‘trick’ here is that while LLVM does require all register values to be in SSA form, it does not require (or permit) memory objects to be in SSA form. With this in mind, the high-level idea is that we want to make a stack variable (which lives in memory) for each mutable object in a function. Using that, our code now becomes:
@G = weak global i32 0   ; type of @G is i32*
@H = weak global i32 0   ; type of @H is i32*

define i32 @test(i1 %Condition) {
entry:
  %X = alloca i32            ; type of %X is i32*
  br i1 %Condition, label %cond_true, label %cond_false

cond_true:
  %X.0 = load i32, i32* @G
  store i32 %X.0, i32* %X    ; Update X
  br label %cond_next

cond_false:
  %X.1 = load i32, i32* @H
  store i32 %X.1, i32* %X    ; Update X
  br label %cond_next

cond_next:
  %X.2 = load i32, i32* %X   ; Read X
  ret i32 %X.2
}
With this, we have discovered a way to handle arbitrary mutable variables without the need to create Phi nodes at all!
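If you are generating LLVM IR from your own front end, this alloca-per-mutable-variable strategy is straightforward to drive from LLVM's C++ IRBuilder API. The following is a minimal sketch (my own illustration, not code from the LLVM docs, and assuming a reasonably recent LLVM) that builds the @test function above with an alloca for X:

#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/Verifier.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

int main() {
  LLVMContext Ctx;
  Module M("demo", Ctx);
  IRBuilder<> B(Ctx);

  Type *I32 = B.getInt32Ty();
  auto *G = new GlobalVariable(M, I32, /*isConstant=*/false,
                               GlobalValue::WeakAnyLinkage, B.getInt32(0), "G");
  auto *H = new GlobalVariable(M, I32, /*isConstant=*/false,
                               GlobalValue::WeakAnyLinkage, B.getInt32(0), "H");

  FunctionType *FT = FunctionType::get(I32, {B.getInt1Ty()}, /*isVarArg=*/false);
  Function *F = Function::Create(FT, Function::ExternalLinkage, "test", M);
  Value *Cond = F->getArg(0);

  BasicBlock *Entry = BasicBlock::Create(Ctx, "entry", F);
  BasicBlock *True  = BasicBlock::Create(Ctx, "cond_true", F);
  BasicBlock *False = BasicBlock::Create(Ctx, "cond_false", F);
  BasicBlock *Next  = BasicBlock::Create(Ctx, "cond_next", F);

  // entry: one stack slot for the mutable variable X, then branch on the condition.
  B.SetInsertPoint(Entry);
  AllocaInst *X = B.CreateAlloca(I32, nullptr, "X");
  B.CreateCondBr(Cond, True, False);

  // cond_true: X = G
  B.SetInsertPoint(True);
  B.CreateStore(B.CreateLoad(I32, G, "X.0"), X);
  B.CreateBr(Next);

  // cond_false: X = H
  B.SetInsertPoint(False);
  B.CreateStore(B.CreateLoad(I32, H, "X.1"), X);
  B.CreateBr(Next);

  // cond_next: return the current value of X
  B.SetInsertPoint(Next);
  B.CreateRet(B.CreateLoad(I32, X, "X.2"));

  verifyFunction(*F, &errs());
  M.print(outs(), nullptr);
  return 0;
}

The point is that each mutable source variable gets exactly one alloca in the entry block, and every read or write of it becomes a load or store through that pointer, so the front end never has to compute dominance frontiers or place Phi nodes itself.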
Further (for the curious):
While this solution solves our immediate problem, it introduces another one: we have now created a lot of stack traffic for very simple and common operations, which is a major performance problem. Fortunately, the LLVM optimizer has a highly tuned optimization pass named “mem2reg” that handles this case, promoting allocas like this into SSA registers and inserting Phi nodes as appropriate.
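To answer the question about optimizations directly: mem2reg is the main pass that consumes this alloca pattern. Below is a rough sketch (my own wiring, assuming a recent LLVM with the new pass manager) of how you could run it programmatically; from the command line the equivalent is roughly: opt -passes=mem2reg input.ll -S

#include "llvm/Analysis/CGSCCPassManager.h"
#include "llvm/Analysis/LoopAnalysisManager.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/PassManager.h"
#include "llvm/Passes/PassBuilder.h"
#include "llvm/Transforms/Utils/Mem2Reg.h"
#include <utility>

using namespace llvm;

// Promote eligible allocas in every function of M to SSA registers.
void runMem2Reg(Module &M) {
  // Standard new-pass-manager boilerplate: create and cross-register the
  // analysis managers so function passes can query the analyses they need.
  LoopAnalysisManager LAM;
  FunctionAnalysisManager FAM;
  CGSCCAnalysisManager CGAM;
  ModuleAnalysisManager MAM;
  PassBuilder PB;
  PB.registerModuleAnalyses(MAM);
  PB.registerCGSCCAnalyses(CGAM);
  PB.registerFunctionAnalyses(FAM);
  PB.registerLoopAnalyses(LAM);
  PB.crossRegisterProxies(LAM, FAM, CGAM, MAM);

  // PromotePass is the new-pass-manager name for mem2reg.
  FunctionPassManager FPM;
  FPM.addPass(PromotePass());

  ModulePassManager MPM;
  MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM)));
  MPM.run(M, MAM);
}

Running this over the alloca version of @test should give you back essentially the Phi-based version shown earlier, which is why front ends can afford to emit the simple load/store form.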