
Debugging LLVM IR

Tags:

llvm

I've built an LLVM-targeting frontend which produces some IR. Unsurprisingly, the IR output is incorrect in some cases (as in, it appears correct, but the resulting program crashes when executed). However, I haven't found many useful tools to deal with this.

I have tried using lli, but its error messages are spectacularly unhelpful (even though you would assume that an interpreter could give very precise error details).

I looked into converting the IR to C code, and then debugging that with Visual Studio, but it seems this functionality was removed from LLVM.

I also looked into using GDB. However, the DWARF debug information format seems to be quite specific to a few existing languages. In addition, the source that I am translating with my frontend is correct; it's the produced IR which is wrong, so debug symbols for the original source wouldn't be too helpful. For example, I'd need to see the values of a bunch of intermediate registers that don't correspond to any source variable, or set breakpoints in compiler-generated functions.

What tools and techniques exist to debug LLVM IR output?

Puppy asked May 28 '13 18:05


2 Answers

I'm not sure I understand your problem fully. Are you saying that your compiler (from language X to LLVM IR) is producing incorrect output (incorrect LLVM IR) and you're not sure how to debug it? In other words, there are two possibilities:

  1. The IR produced by your compiler is incorrect - you can point at some instruction(s) and say - this is not what I meant to generate.
  2. The IR seems correct but doesn't produce the results I expected it to produce.

I assume it's (1) you're talking about (because this is what the question was saying before you updated it).

This wouldn't be an LLVM-specific problem, then. Assume you're writing a compiler from language X to native code. The produced native code is incorrect - how do you debug the problem? Well, you debug your compiler, obviously. You try to find the last place where the compiler's understanding of the input was correct, or the first place where it became incorrect. How you do this depends hugely on the architecture of your compiler. However, something that helps a lot is having a printable representation of the other intermediate layers in your compiler.

For example, Clang (which produces LLVM IR from C, C++ and Objective-C) can dump its full AST. So looking at the AST for the incorrect code can cut the compiler in half, helping determine whether the problem is in the front-end (C source -> AST) or in code gen (AST -> LLVM IR). The LLVM backend (which compiles LLVM IR to native code) also has a few intermediate layers (most notably SelectionDAG and MachineInstrs, or MIs), which can be examined for the sake of debugging. These are just examples from other existing compilers; YMMV with yours.
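To make the "cut the compiler in half" idea concrete, here is a minimal sketch of a toy two-stage compiler in Python in which each intermediate layer can be printed. Everything in it is hypothetical: the names `parse`, `dump_ast` and `lower` are made up for illustration, and the "IR" is an invented three-address form, not real LLVM IR.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Num:
    value: int


@dataclass
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Add]


def parse(src: str) -> Expr:
    """Front end: turn 'a + b + c' (integers only) into a left-leaning AST."""
    terms = [Num(int(t)) for t in src.split("+")]
    tree: Expr = terms[0]
    for term in terms[1:]:
        tree = Add(tree, term)
    return tree


def dump_ast(node: Expr, indent: int = 0) -> str:
    """Printable representation of the AST layer."""
    pad = "  " * indent
    if isinstance(node, Num):
        return f"{pad}Num {node.value}"
    return "\n".join(
        [f"{pad}Add", dump_ast(node.left, indent + 1), dump_ast(node.right, indent + 1)]
    )


def lower(node: Expr, out: List[str]) -> str:
    """Code gen: append pseudo-IR lines to `out`; return the operand holding the result."""
    if isinstance(node, Num):
        return str(node.value)
    lhs = lower(node.left, out)
    rhs = lower(node.right, out)
    name = f"%t{len(out)}"  # hypothetical temporary naming scheme
    out.append(f"{name} = add {lhs}, {rhs}")
    return name


if __name__ == "__main__":
    ast = parse("1 + 2 + 3")
    print(dump_ast(ast))   # inspect this first: is the AST what you meant?
    ir: List[str] = []
    lower(ast, ir)
    print("\n".join(ir))   # only then look at the generated code
```

If the AST dump already looks wrong, the bug is in the front end; if the AST is right but the emitted lines are not, the bug is in code generation.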

Eli Bendersky answered Nov 08 '22 13:11


Will Dietz described on llvm-dev how he implemented stepping through the IR itself in gdb:
https://groups.google.com/d/msg/llvm-dev/O4Dj9FW1gtM/ovnm6dqoJJsJ

Hi all,

For my own purposes, I wrote a pass that does exactly what you all are describing: add debug metadata to LLVM IR.

As a pass, it had to tackle the problem of "This file needs to exist on disk somewhere so gdb can find it", which I solved by dumping it onto /tmp/ somewhere. Not a great solution (who deletes these?) but it worked well enough.

Another interesting issue is how to coexist with any existing debug metadata, which can be useful for simultaneously debugging an IR transform inline with the C source for instrumentation-style passes like SAFECode, ASan/TSan.

Quick Example:

(gdb) break main
Breakpoint 1 at 0x4010b1: file /home/wdietz2/magic/test/unit/test_loop.c, line 9.
(gdb) r
Starting program: /home/wdietz2/llvm/32-obj-make/projects/magic/test/Output/test_loop

Breakpoint 1, main (argc=<value optimized out>, argv=<value optimized out>) at /home/wdietz2/magic/test/unit/test_loop.c:9
9         unsigned k = 0;
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.80.el6_3.5.x86_64 libgcc-4.4.6-4.el6.x86_64 libstdc++-4.4.6-4.el6.x86_64
(gdb) n
10        source(argc != 0, &k);
(gdb) n
14        %and.i.i.i.i104 = and i64 %4, 70368744177660
(gdb) n
15        %5 = load i8** @global, align 8
(gdb) n
18        store i32 16843009, i32* %6, align 1
(gdb) n
19        store i8 1, i8* getelementptr inbounds ([1 x i8]* @array, i64 0, i64 0), align 1
(gdb) n
20        call coldcc void @runtime_func() nounwind
(gdb) n
11        while(i-- > argc)
(gdb) n
23        %and.i.i.i.i85 = and i64 %7, 70368744177660
(gdb) n
14          while(j++ < i) k += j;
(gdb) n
11        while(i-- > argc)
(gdb) n
14          while(j++ < i) k += j;
(gdb) n
102       %77 = load i8** @global, align 8
(gdb) n
105       %79 = load i32* %78, align 4
(gdb) n
106       %cmp7.i.i.i = icmp ne i32 %79, 0
(gdb) n
108       call void @llvm.memset.p0i8.i64(i8* %add.ptr.i.i.i.i86, i8 %conv8.i.i.i, i64 4, i32 1, i1 false) nounwind
(gdb) n
14          while(j++ < i) k += j;
(gdb) n
15          while(j-- > 0) k *= k + j;
(gdb) n
95        %69 = load i8** @global, align 8
(gdb) n
98        %71 = load i32* %70, align 4
(gdb)

The pass itself is rather simple--the hard problem it solves is emitting the IR to disk and reasoning about what Instruction* is on what line, which really shouldn't be a problem if done properly in LLVM. If desired I can certainly make the code available on request.

In short, it seemed to work well for me and having it done properly in LLVM itself would be great!

Unfortunately, it seems that the code is not available.
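Since that code is not available, the following is only a rough sketch of the bookkeeping half of the idea: write the textual IR to a file on disk and record which instruction sits on which line. It is not the pass from the email and does not use the LLVM API. The function name `dump_ir_with_line_map`, the sample IR, and the "indented lines are instructions" heuristic are all assumptions made for illustration; a real pass would work on `Instruction` objects and attach debug locations pointing at the dumped file.

```python
import os
import tempfile
from typing import Dict, Tuple

# Hypothetical sample input; any textual IR with one instruction per line would do.
SAMPLE_IR = (
    "define i32 @main() {\n"
    "entry:\n"
    "  %x = add i32 1, 2\n"
    "  ret i32 %x\n"
    "}\n"
)


def dump_ir_with_line_map(ir_text: str) -> Tuple[str, Dict[int, str]]:
    """Write IR text to a temp file; return (path, {1-based line number: instruction})."""
    fd, path = tempfile.mkstemp(suffix=".ll")
    with os.fdopen(fd, "w") as f:
        f.write(ir_text)

    line_map: Dict[int, str] = {}
    for lineno, line in enumerate(ir_text.splitlines(), start=1):
        stripped = line.strip()
        # Crude heuristic (an assumption): instructions are the indented,
        # non-empty lines that are neither comments nor block labels.
        is_indented = line.startswith((" ", "\t"))
        if is_indented and stripped and not stripped.startswith(";") and not stripped.endswith(":"):
            line_map[lineno] = stripped
    return path, line_map


if __name__ == "__main__":
    path, line_map = dump_ir_with_line_map(SAMPLE_IR)
    for lineno, inst in sorted(line_map.items()):
        print(f"{path}:{lineno}: {inst}")
    os.remove(path)  # the caller owns cleanup, which answers "who deletes these?"
```

The heuristic breaks on instructions that span several lines, so treat it as an illustration of the line-to-instruction mapping rather than something to build on directly.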

Joachim Breitner answered Nov 08 '22 11:11