
Using LLVM bytecode for libraries (instead of native object files)

What are the implications on

  • portability (calling convention: does it really matter at an LLVM level when only calling into C or OS library functions)
  • link time
  • optimizations

I would like to compile a toy language with LLVM, since the hard parts (optimization, object code generation) are already there, but I am struggling with a concept I'd like to keep if it is worth it: library files should be portable, redistributable, and usable as both static and shared libraries (in the shared case, a real .so or .dll would be generated when the final application is linked). I believe this would cut compilation time, since native code generation, and perhaps optimization, would be done only once, at final link time. I envision the linker taking care of the calling convention (if possible) and of the conversion to a shared library if requested. As a far-fetched addition, perhaps the LLVM JIT could run the generated bytecode directly instead of linking, completely removing link times while writing code.

Does this sound

  1. Doable?
  2. Worth it? I know C/C++ link time is comparatively long, which is problematic when rebuilding frequently. What about "free" link-time optimization (cf. /GL and -flto), since it is essentially LLVM bytecode being linked together and then turned into a native binary?

This may be a vague question; if I need to clarify something, please ask.

rubenvb asked Feb 18 '12 14:02


1 Answer

I have done something similar to this in the past. One thing you should realize is that LLVM bitcode is not "portable", in that it is not completely machine independent. Bitcode files carry knowledge of details, such as the size of pointers, that are specific to the processor being targeted.
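
To illustrate the point about pointer sizes, here is a minimal C sketch (the `struct node` type and both function names are hypothetical, not from my original project). A C front end such as clang resolves `sizeof` and `offsetof` while emitting IR, so the values end up in the bitcode as plain constants for whichever target was selected, alongside the module's target triple and data layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record type, used only for illustration. */
struct node {
    char  tag;
    void *next;
};

/* Checked by the front end at compile time, for the selected target only. */
static_assert(sizeof(struct node) >= sizeof(char) + sizeof(void *),
              "struct node must hold both members");

/* Folded to a constant before IR is emitted: typically 4 on a 32-bit
   target and 8 on a 64-bit one. */
size_t pointer_size(void)
{
    return sizeof(void *);
}

/* The padding after `tag` depends on the target's pointer alignment,
   so this constant is target specific as well. */
size_t next_offset(void)
{
    return offsetof(struct node, next);
}
```

A toy language that never exposes such sizes to the front end (or defers them to a later lowering step) avoids this particular dependency, but that is a design assumption rather than something LLVM provides for free.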

Having said that, in the past I have compiled programs and their support libraries to bitcode and linked the bitcode files together before generating an assembly file for the whole program. You're right that calling conventions aren't important for calls that are internal but calls made outside (or from outside) still require that the ABI is followed.
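
As a rough sketch of that distinction in C terms (the function names `count_char` and `count_spaces` are made up for this example): a function with internal linkage is invisible outside the module, so the optimizer may inline it or pick a different calling convention for it, whereas an exported function and any call into libc have to follow the platform ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Internal linkage: only this module can call it, so the optimizer
   may inline it or change how its arguments are passed. */
static size_t count_char(const char *s, char c)
{
    size_t n = 0;
    for (; *s != '\0'; ++s)
        if (*s == c)
            ++n;
    return n;
}

/* External linkage: callers outside the module rely on the platform
   ABI, so the convention of this entry point is fixed. */
size_t count_spaces(const char *s)
{
    assert(s != NULL);
    if (strlen(s) == 0)   /* call into libc, which also follows the ABI */
        return 0;
    return count_char(s, ' ');
}
```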

You may be able to design your toy language in such a way that you can avoid processor-dependent bitcode, but you'll have to be very careful.

I noticed that linking the bitcode files together took quite a while, especially at high optimization levels. That may have sped up by now; I did this with an LLVM release from 2 or 3 years ago.

One final point: depending on the target processor, you'll probably need the equivalent of libgcc.a or compiler-rt to handle operations the processor has no instructions for, such as floating point or 64-bit integer arithmetic.
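
A small C sketch of where such helpers come from (the function names `div_u64` and `scale` are hypothetical). On a desktop 64-bit CPU both functions compile to ordinary instructions; on a target that lacks them, the code generator typically emits calls to runtime routines instead, and the exact helper names vary by target and ABI.

```c
#include <assert.h>
#include <stdint.h>

/* On many 32-bit targets there is no 64-bit divide instruction, so this
   is usually lowered to a call to a helper such as __udivdi3, which
   libgcc or compiler-rt has to supply at link time. */
uint64_t div_u64(uint64_t a, uint64_t b)
{
    assert(b != 0);
    return a / b;
}

/* On a CPU without a floating point unit this usually becomes a
   soft-float call, for example __muldf3. */
double scale(double x, double factor)
{
    return x * factor;
}
```

If the final link reports undefined symbols with names like these, a missing runtime library is the likely cause.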

Richard Pennington answered Sep 27 '22 21:09