I'm writing a compiler for a simple C-like language for a course I'm taking. This bit of code:
int main() {
    printInt(not(0));
    return 0;
}
int not(int n) {
    if (n == 0) {
        return 1;
    } else {
        int result = 0;
        return result;
    }
}
...I naively compile to this LLVM assembly:
declare void @printInt(i32)
declare void @printDouble(double)
declare void @printString(i8*)
declare i32 @readInt()
declare double @readDouble()

define i32 @main() {
entry:
    %0 = call i32 @not(i32 0)
    call void @printInt(i32 %0)
    ret i32 0
    unreachable
}

define i32 @not(i32 %n_0) {
entry:
    %0 = icmp eq i32 %n_0, 0
    br i1 %0, label %lab0, label %lab1
lab0:
    ret i32 1
    br label %lab2
lab1:
    %result_0 = alloca i32
    store i32 0, i32* %result_0
    %1 = load i32* %result_0
    ret i32 %1
    br label %lab2
lab2:
    unreachable
}
However, opt does not accept that code.
opt: core023.ll:25:5: error: instruction expected to be numbered '%2'
%1 = load i32* %result_0
Now, from what I understand of unnamed temporary registers, they're supposed to be numbered sequentially starting from 0, which is the case here. But apparently the "%1 = load.." line should have been numbered %2. Why is that? Do any of the instructions between %0 and %1 increase the sequence number? Or maybe it's just a follow-on fault from something else?
In LLVM, everything that can have a name but does not is assigned the next number in a sequence that restarts in each function. This includes basic blocks, not just instruction results. In your case
lab0:
    ret i32 1
    br label %lab2
defines two basic blocks, because every terminator instruction ends a basic block: the ret closes lab0, so the br that follows it starts a new, unlabeled block. This means that, conceptually, your code is parsed as
lab0:
    ret i32 1
1:
    br label %lab2
and the next free number after that is 2, which is why opt expects your load to be numbered %2.
To prevent strange behavior like this, I recommend always explicitly naming basic blocks.
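As a rough sketch of that recommendation, here is how @not could look with every block named. The labels dead0 and dead1 are hypothetical names I made up for the blocks that start after each ret; your compiler can generate whatever names it likes. I kept the load syntax from your question; newer LLVM releases spell the load with an explicit result type, so adjust that line for your version.

```llvm
define i32 @not(i32 %n_0) {
entry:
    %0 = icmp eq i32 %n_0, 0
    br i1 %0, label %lab0, label %lab1
lab0:
    ret i32 1
dead0:                              ; hypothetical name: block that starts after the ret
    br label %lab2
lab1:
    %result_0 = alloca i32
    store i32 0, i32* %result_0
    %1 = load i32* %result_0        ; the blocks are all named, so %1 is the next free number
    ret i32 %1
dead1:                              ; hypothetical name: block that starts after the ret
    br label %lab2
lab2:
    unreachable
}
```

Since no block is left unnamed, only the icmp and the load take numbers from the sequence, so %0 and %1 line up. Another option, if your code generator can track it, is to stop emitting instructions once the current block already has a terminator, which avoids the dead blocks altogether.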