I'm trying to learn my way around the LLVM infrastructure. I've installed the LLVM binaries for Windows on a MinGW installation.
I'm following the tutorial found on the LLVM site about the so-called Kaleidoscope language. I have a source file that has exactly the code listing at the end of this page.
Also, in case it matters, I'm building with the following flags (obtained through llvm-config ahead of time, because the Windows shell doesn't have very comfortable substitution syntax):
clang++ -g -O3 kaleido.cpp -o kaleido.exe -IC:/MinGW/include -DNDEBUG -D__NO_CTYPE_INLINE -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -LC:/MinGW/lib -lLLVMCore -lLLVMSupport -lpthread -lLLVMX86Disassembler -lLLVMX86AsmParser -lLLVMX86CodeGen -lLLVMSelectionDAG -lLLVMAsmPrinter -lLLVMMCParser -lLLVMX86Desc -lLLVMX86Info -lLLVMX86AsmPrinter -lLLVMX86Utils -lLLVMJIT -lLLVMRuntimeDyld -lLLVMExecutionEngine -lLLVMCodeGen -lLLVMScalarOpts -lLLVMInstCombine -lLLVMTransformUtils -lLLVMipa -lLLVMAnalysis -lLLVMTarget -lLLVMMC -lLLVMObject -lLLVMCore -lLLVMSupport -lm -limagehlp -lpsapi
Using the language as implemented in that code, I'm testing a few top-level expressions. First, one with literals:
ready> 5 + 3;
ready> Read top-level expression:
define double @0() {
entry:
ret double 8.000000e+00
}
Evaluated to 8.000000
...Works as expected. Then a function definition with a constant result:
ready> def f(x) 12;
ready> Read function definition:
define double @f(double %x) {
entry:
ret double 1.200000e+01
}
...Again, working as expected. Calling this for any input gives a fixed result:
ready> f(5);
ready> Read top-level expression:
define double @1() {
entry:
%calltmp = call double @f(double 5.000000e+00)
ret double %calltmp
}
Evaluated to 12.000000
...No surprise. Then, a function definition that does something with the parameter:
ready> def g(x) x + 1;
ready> Read function definition:
define double @g(double %x) {
entry:
%addtmp = fadd double 1.000000e+00, %x
ret double %addtmp
}
...Looks okay, the IR is generated. Now, calling it:
ready> g(5);
ready> Read top-level expression:
define double @2() {
entry:
%calltmp = call double @g(double 5.000000e+00)
ret double %calltmp
}
0x00D400A4 (0x0000000A 0x00000000 0x0028FF28 0x00D40087) <unknown module>
0x00C7A5E0 (0x01078A28 0x010CF040 0x0028FEF0 0x40280000)
0x004023F1 (0x00000001 0x01072FD0 0x01071B10 0xFFFFFFFF)
0x004010B9 (0x00000001 0x00000000 0x00000000 0x00000000)
0x00401284 (0x7EFDE000 0x0028FFD4 0x77E59F42 0x7EFDE000)
0x75693677 (0x7EFDE000 0x7B3361A2 0x00000000 0x00000000), BaseThreadInitThunk() + 0x12 bytes(s)
0x77E59F42 (0x0040126C 0x7EFDE000 0x00000000 0x00000000), RtlInitializeExceptionChain() + 0x63 bytes(s)
0x77E59F15 (0x0040126C 0x7EFDE000 0x00000000 0x78746341), RtlInitializeExceptionChain() + 0x36 bytes(s)
...Crashes.
Through some rudimentary debugging, I've come to believe that both pieces of code involved, the top-level expression (the call to g(x) with an argument of 5) and the called function, are JIT-compiled successfully. I believe this is the case because I get the function pointer before the crash (and I'm assuming the execution engine returns a function pointer only after it has successfully compiled the function). To be more precise, the crash happens exactly at the point where the function pointer is called, meaning this line in my source file (in HandleTopLevelExpression()):
fprintf(stderr, "Evaluated to %f\n", FP());
Most probably the line itself is innocent, because it runs successfully for other functions. The culprit is likely somewhere inside the function pointed to by FP in the last of the above examples, but since that code is generated at runtime, I don't have it in my cpp file.
Any ideas on why it might be crashing on that specific scenario?
UPDATE #1: Running the process through gdb shows this at the crash point:
Program received signal SIGILL, Illegal instruction.
And a trace that doesn't tell me anything:
0x00ee0044 in ?? ()
UPDATE #2: In an attempt to shed some more light on this, here's the assembly around the crash:
00D70068 55 PUSH EBP
00D70069 89E5 MOV EBP,ESP
00D7006B 81E4 F8FFFFFF AND ESP,FFFFFFF8
00D70071 83EC 08 SUB ESP,8
00D70074 C5FB LDS EDI,EBX ; Here! ; Illegal use of register
00D70076 1045 08 ADC BYTE PTR SS:[EBP+8],AL
00D70079 C5FB LDS EDI,EBX ; Illegal use of register
00D7007B 58 POP EAX
00D7007C 05 6000D700 ADD EAX,0D70060
00D70081 C5FB LDS EDI,EBX ; Illegal use of register
00D70083 110424 ADC DWORD PTR SS:[ESP],EAX
00D70086 DD0424 FLD QWORD PTR SS:[ESP]
00D70089 89EC MOV ESP,EBP
00D7008B 5D POP EBP
00D7008C C3 RETN
The crash happens at 00D70074, the instruction being LDS EDI,EBX. It is a few addresses higher than the address pointed to by FP (which makes me believe that this all might be JIT-emitted code, but please take this conclusion with a grain of salt, as I'm in over my head here).
As you can see, the disassembler has also placed a comment on that and the next similar lines, saying it's an illegal use of the register. To be honest, I don't know why this specific extended register pair is illegal for this instruction, but if it is illegal, why is it there at all and how can we make the compiler produce legal code?
Apparently LLVM is generating VEX-prefixed AVX instructions for you, but your processor doesn't support that instruction set (and neither does your disassembler).
AVX-aware decoding of your JIT bytes gives the following valid code:
0: 55 push ebp
1: 89 e5 mov ebp,esp
3: 81 e4 f8 ff ff ff and esp,0xfffffff8
9: 83 ec 08 sub esp,0x8
c: c5 fb 10 45 08 vmovsd xmm0,QWORD PTR [ebp+0x8]
11: c5 fb 58 05 60 00 d7 vaddsd xmm0,xmm0,QWORD PTR ds:0xd70060
18: 00
19: c5 fb 11 04 24 vmovsd QWORD PTR [esp],xmm0
1e: dd 04 24 fld QWORD PTR [esp]
21: 89 ec mov esp,ebp
23: 5d pop ebp
24: c3 ret
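As for why the older disassembler shows LDS EDI,EBX: in 32-bit mode the byte C5 is the LDS opcode, and a following byte whose top two bits are 11 would be a register-to-register form, which LDS does not allow. AVX reuses exactly that invalid encoding as the two-byte VEX prefix. Here is a minimal sketch of decoding the prefix fields from the second byte. The names Vex2 and decodeVex2 are made up for this illustration and are not part of LLVM or any disassembler.

```cpp
#include <cstdint>
#include <string>

// Fields carried by a two-byte VEX prefix (0xC5 followed by one payload byte).
// Illustration only; the R bit (bit 7 of the payload) is ignored here.
struct Vex2 {
    bool valid;          // true if the first byte is 0xC5
    unsigned reg;        // vvvv field: extra source register, stored inverted
    bool is256;          // L bit: true for 256-bit (ymm) operations
    std::string implied; // legacy prefix implied by the pp field
};

Vex2 decodeVex2(std::uint8_t b0, std::uint8_t b1) {
    Vex2 v{false, 0, false, ""};
    if (b0 != 0xC5) return v;
    v.valid = true;
    v.reg = (~(b1 >> 3)) & 0xF;   // bits 6..3, bitwise inverted
    v.is256 = ((b1 >> 2) & 1) != 0; // bit 2
    static const char* const kPrefix[] = {"none", "66", "F3", "F2"};
    v.implied = kPrefix[b1 & 3];  // bits 1..0
    return v;
}
```

For the bytes C5 FB from the dump, decodeVex2(0xC5, 0xFB) yields register 0 (xmm0), a non-256-bit operation, and an implied F2 prefix. F2 0F 10, F2 0F 58 and F2 0F 11 are the legacy movsd load, addsd and movsd store encodings, which matches the vmovsd / vaddsd / vmovsd sequence above.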
If LLVM is misdetecting your native architecture, or if you just want to override it, you can change the EngineBuilder used in the sample code, for example like this:
TheExecutionEngine = EngineBuilder(TheModule).setErrorStr(&ErrStr).setMCPU("i386").create();
You can also set the architecture or provide attributes.
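For the attribute route, target features are usually written as strings with a leading + or - (for example "-avx" to turn AVX off). Below is a small sketch of building such a list. The helper disableFeatures is hypothetical, and passing its result to EngineBuilder assumes your LLVM version has a setMAttrs method that accepts a sequence of strings, so check your ExecutionEngine.h before relying on it.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: turn feature names such as "avx" into "-avx"
// strings, the form used to switch a target feature off.
std::vector<std::string> disableFeatures(const std::vector<std::string>& names) {
    std::vector<std::string> attrs;
    for (std::size_t i = 0; i < names.size(); ++i)
        attrs.push_back("-" + names[i]);
    return attrs;
}

// Assumed usage inside the tutorial's main(), not verified against
// every LLVM release:
//   std::vector<std::string> off;
//   off.push_back("avx");
//   TheExecutionEngine = EngineBuilder(TheModule)
//       .setErrorStr(&ErrStr)
//       .setMAttrs(disableFeatures(off))
//       .create();
```

Compared with setMCPU("i386"), switching off only the offending feature should leave the rest of the detected instruction set available to the JIT.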