
Create LLVM bytecode from C++ classes

I'm writing a compiler for a special purpose language in LLVM. I want to add bindings for a library that is already written in C++. My idea is to compile the library to LLVM bytecode (using clang -emit-llvm -S abc.c) and link it during compilation. This works well for code like

// lib.c
int f() {
    return 123;
}

But parts of the library are written like

// A.cc
class A {
    public:
        int f() { return 123; }
};

This results in empty bytecode files. I know that I can fix this by separating the implementation from the class definition:

// A.cc
class A {
    public:
        int f();
};

int A::f() {
    return 123;
}

But that would be a lot of tedious work. Is there any way to create useful bytecode from my library sources as they are? Or any other way to make the library available in my compiler?

asked Jun 23 '11 11:06 by Thomas Schaub



1 Answer

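You could check whether clang emits code for explicit template instantiations, as it should even when no member is ever called. Explicit instantiation is a template feature, so this helps directly only if the class is a template; for a non-template class you would have to 'force it' to work by wrapping the class in a template.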

Simple synopsis:

lib1.h

template <typename T=int>
struct ATemplate { T f() { return 123; } };

add a file lib1_instantiate.cpp

#include "lib1.h"
template struct ATemplate<int>;
template struct ATemplate<unsigned int>;
template struct ATemplate<long>; // etc.

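An explicit instantiation definition instantiates every member of the named specializations, so their definitions should be emitted in this translation unit even if nothing calls them.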
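As a minimal sketch, the two files above can be folded into one translation unit to try this out. The build commands in the comments are an assumption about the local setup (clang++ on the PATH, output file names chosen for illustration), not something taken from the library in question.

```cpp
// lib1_instantiate.cpp (single-file variant of the synopsis above)
// Possible build commands, assuming clang++ is installed:
//   clang++ -S -emit-llvm lib1_instantiate.cpp -o lib1_instantiate.ll   (textual IR)
//   clang++ -c -emit-llvm lib1_instantiate.cpp -o lib1_instantiate.bc   (bitcode)

// Normally this template would live in lib1.h.
template <typename T = int>
struct ATemplate {
    T f() { return 123; }  // defined in-class, hence implicitly inline
};

// Explicit instantiation definitions: every member of these
// specializations is instantiated here, even though nothing calls f().
template struct ATemplate<int>;
template struct ATemplate<long>;
```

In the resulting .ll file you would expect to find `define` lines for the mangled names of `ATemplate<int>::f` and `ATemplate<long>::f`; the exact mangling depends on the target ABI.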

If you're stuck with a non-template class, and the trick above doesn't work for that, you might wrap it like so:

instantiate.cpp:

namespace hidden_details
{
    template <class libtype> struct instantiator : public libtype 
    // derives... just do something that requires a complete type (not a forward!)
    { };
}

template struct hidden_details::instantiator<A>;

If you're out of luck you'll have to 'use' the inline members for them to get external linkage. A common trick is to use the address of these members (you won't need to implement delegating stuff):

instantiate.cpp:

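void force_use_A() // not static: an unused static function would itself be discarded
{
    void* unused = (void*) &A::f; // pointer-to-member to void*: a GCC extension, not standard C++
    (void) unused;
}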

However

  1. the conversion of a pointer to member function to (void*) is not valid standard C++; gcc accepts it only as an extension, so you can't compile that with -pedantic -Werror
  2. for overloads you'll have to specify ugly casts to disambiguate them
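Both caveats can be sidestepped by storing the address in a properly typed pointer to member function instead of a void*. The following is a sketch under assumptions: class B and all the force_* variable names are hypothetical and only illustrate the overload case, and in the real library A would come from its own source file rather than being repeated here. The variables are deliberately non-const and at namespace scope so that they have external linkage; an unused static function or variable may simply be dropped by the compiler.

```cpp
// instantiate.cpp: a variant that avoids the (void*) cast.
class A {
public:
    int f() { return 123; }
};

// Hypothetical class with overloads, only to show disambiguation.
class B {
public:
    int g(int x) { return x + 1; }
    double g(double x) { return x * 2.0; }
};

// Non-const namespace-scope variables have external linkage, so the
// compiler has to emit them together with the functions they point to.
int (A::*force_A_f)() = &A::f;

// The declared type of the variable selects the overload...
int (B::*force_B_g_int)(int) = &B::g;
// ...or an explicit static_cast does.
auto force_B_g_double = static_cast<double (B::*)(double)>(&B::g);
```

Compiling this file with -emit-llvm should then produce definitions for A::f and both overloads of B::g, since each one is referenced by an externally visible variable.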

HTH

answered Oct 21 '22 13:10 by sehe