
Assigning literal value to local variable in LLVM IR

I've been messing around with code generation to LLVM IR, and there are two things I don't quite understand: when local variables have to be allocated, and how to load a literal value into a local variable. If I compile the following simple C code to LLVM IR,

//test.c
int main() {
    int x = 3;
    int y = x + 4;
    return y;
}

I get this output:

; ModuleID = 'test.c'
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx10.10.0"

; Function Attrs: nounwind ssp uwtable
define i32 @main() #0 {
    %1 = alloca i32, align 4
    %x = alloca i32, align 4
    %y = alloca i32, align 4
    store i32 0, i32* %1
    store i32 3, i32* %x, align 4
    %2 = load i32* %x, align 4
    %3 = add nsw i32 %2, 4
    store i32 %3, i32* %y, align 4
    %4 = load i32* %y, align 4
    ret i32 %4
}

attributes #0 = { nounwind ssp uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }

!llvm.ident = !{!0}

!0 = metadata !{metadata !"Apple LLVM version 5.1 (clang-503.0.40) (based on LLVM 3.4svn)"}

which I can trim down to

target triple = "x86_64-apple-macosx10.10.0"

define i32 @main() {
    %t1 = alloca i32
    store i32 3, i32* %t1
    %x = add nsw i32 0, 3

    %y = add nsw i32 %x, 4
    ret i32 %y
}

So I guess my question is: why is such a roundabout method used for loading a literal number into a variable? Is there a better/more direct way? Also, why does %t1 need to be alloca-ed, but %x and %y do not?

asked Jul 07 '14 07:07 by Jumhyn


1 Answer

Clang is responsible for generating the first code segment. It chooses the easiest way to generate IR for these statements, which is to allocate stack memory for each variable and then store to and load from that memory. This produces IR that emulates the semantics of C variables, which can be reassigned different values throughout their lifetime. LLVM IR itself has no such thing: there are no mutable variables, only SSA values that are each defined exactly once (read more about SSA).
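To make the reassignment point concrete, here is a small C sketch. The function name pick is hypothetical and does not come from the question. The variable r is assigned on two different paths, so no single SSA value can stand for it. A front end that gives r a stack slot can simply emit a store for each assignment and a load for each read, and leave it to a later pass to introduce the phi node that SSA form would otherwise require.

```c
/* Hypothetical example: a variable that is reassigned on one path only. */
int pick(int flag) {
    int r = 3;        /* first assignment: a store into r's stack slot */
    if (flag) {
        r = r + 4;    /* reassignment: load r, add 4, store to the same slot */
    }
    return r;         /* read: a load from the slot; pure SSA would need a phi here */
}
```

Calling pick(0) yields the initial value, while pick(1) yields the reassigned one; with a stack slot the front end never has to work out which definition reaches the return.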

What Clang does is just the first step of compilation, though. This IR then goes through numerous transformations (called "passes"). One of them (for example mem2reg) gets rid of the memory use by promoting the stack slots to SSA values, which is essentially what you did by hand in the 2nd snippet; this later allows registers to be used instead of the stack for these values. Another pass gets rid of the unused value %t1 and the store into it, so %t1 does not actually need to be alloca-ed at all. Yet another notices that only constants are involved and replaces the entire function body with ret i32 7, and so on.
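As a rough source-level analogy of that pipeline, the three C functions below mirror the stages. This is only a sketch: the function names are made up, real passes operate on IR rather than on C, and the IR in the comments is approximate.

```c
/* Stage 0: the unoptimized shape, where every variable lives in memory. */
int stage0_memory(void) {
    int x_slot, y_slot;              /* two stack slots, like two allocas */
    int *x = &x_slot, *y = &y_slot;
    *x = 3;                          /* store i32 3 into x's slot */
    *y = *x + 4;                     /* load x, add 4, store into y's slot */
    return *y;                       /* load y, then return it */
}

/* Stage 1: after promotion to registers, only values remain. */
int stage1_registers(void) {
    const int y = 3 + 4;             /* roughly: %y = add nsw i32 3, 4 */
    return y;
}

/* Stage 2: after constant folding, the addition is gone as well. */
int stage2_folded(void) {
    return 7;                        /* roughly: ret i32 7 */
}
```

All three functions return the same value; each stage just removes work that the previous one still did at run time.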

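So to sum it up, it's not a "roundabout way", it's just the easiest way for Clang to generate IR, and making the IR better is the responsibility of the LLVM passes that run later. If you generate IR yourself, there is also no need to load a literal into anything first: a constant such as 3 can be used directly as an operand, as in add nsw i32 3, 4.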

answered Oct 19 '22 09:10 by Oak