How can I compile a Rust program with a custom llc?

Tags: rust, llvm

I have a custom LLVM backend and would like to cross-compile Rust for that custom (no_std) target. I'd like to compile Rust programs in two steps:

  1. Use rustc to generate LLVM IR.
  2. Use my own opt and llc to transform the LLVM IR into machine code.

I tried cargo rustc -- --emit=llvm-ir, which gives me .ll files, and then ran llc on them to get .o files. I cross-compiled libcore the same way, using the same commit for libcore and rustc. When I try to link all the objects together, the linker reports an undefined reference. This looks like a problem with mismatched LLVM versions, but I'm not sure.

Hoblovski asked Oct 22 '18 07:10


1 Answer

There are a couple of things you should be aware of. Most importantly, the LLVM that rustc uses by default, if you get rustc from rustup or a distro package manager, is /not/ an actual LLVM release, and its bitcode may not be compatible with any particular LLVM release. We solved this in my project by building Rust from source and passing the --llvm-root flag to configure, which points the build at an existing LLVM installation. You can then use rustup toolchain link to register the rustc you built as a custom rustup toolchain.

Second, you can make rustc emit .rlib files that contain LLVM bitcode instead of machine code if you use rustc 1.34 or later and pass the -C linker-plugin-lto flag. If that approach does not work for you, I also wrote the following script, which unpacks an rlib file containing object code and repacks it as an rlib file containing LLVM bitcode.

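#!/bin/bash
# Usage: pass the path of the rlib to convert as the first argument.
# The rlib is rewritten in place so that its members hold LLVM bitcode.
dir="$(mktemp -d)"
trap 'rm -rf "$dir"' INT TERM EXIT
archive="$(realpath -m "$1")"
cd "$dir"
ar x "$archive"
# Drop the machine-code objects; the bitcode is recovered from the .bc.z members.
rm ./*.rcgu.o
for file in *.bc.z; do
  # Length of the module identifier: a 4-byte integer at byte offset 15.
  len=$(( $(od -An -t u4 -j 15 -N 4 "$file") ))
  # Length of the compressed bitcode: an 8-byte integer right after the identifier.
  blen=$(( $(od -An -t u8 -j $((len + 19)) -N 8 "$file") ))
  # Cut out the compressed payload that follows the two length fields.
  tail -c "+$((len + 28))" "$file" | head -c "$blen" > "$file.bc.gz"
  # Prepend a gzip header so that gzip can inflate the payload into bitcode.
  printf '\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x00' | cat - "$file.bc.gz" | gzip -dc > "${file%.bc.z}.o"
done
rm ./*.bc.z
rm ./*.gz
rm "$archive"
# Repack the bitcode members under the original archive name.
llvm-ar rs "$archive" ./*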

Once you have the rlib files, you can use any LLVM tool on them the same way you would on a .a file containing LLVM bitcode.
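
For readers who find the od arithmetic in the script hard to follow, here is a minimal Rust sketch of the same offsets. It is only an illustration under assumptions: the names BcZ, split_bc_z and PREFIX_LEN are made up, the 15-byte prefix is skipped without being interpreted, and the length fields are read as little-endian, whereas the script reads them in host byte order.

```rust
/// The two variable-length fields of one `.bc.z` archive member.
pub struct BcZ<'a> {
    /// Module identifier stored before the payload.
    pub identifier: &'a [u8],
    /// Compressed bitcode, without any gzip header.
    pub payload: &'a [u8],
}

/// Bytes skipped before the first length field (the script passes `-j 15` to od).
const PREFIX_LEN: usize = 15;

/// Reads an unsigned little-endian integer from up to 8 bytes.
fn read_le(bytes: &[u8]) -> u64 {
    bytes.iter().rev().fold(0, |acc, &b| (acc << 8) | u64::from(b))
}

/// Splits a `.bc.z` member at the same offsets the shell script uses.
/// Returns `None` if the data is too short for the lengths it declares.
pub fn split_bc_z(data: &[u8]) -> Option<BcZ<'_>> {
    // 4-byte identifier length at offset 15.
    let id_start = PREFIX_LEN + 4;
    let id_len = read_le(data.get(PREFIX_LEN..id_start)?) as usize;
    let id_end = id_start.checked_add(id_len)?;
    let identifier = data.get(id_start..id_end)?;

    // 8-byte payload length right after the identifier (offset len + 19).
    let payload_start = id_end.checked_add(8)?;
    let payload_len = read_le(data.get(id_end..payload_start)?);
    if payload_len > data.len() as u64 {
        return None;
    }

    // The payload begins at offset len + 27, which is `tail -c +(len + 28)`.
    let payload_end = payload_start.checked_add(payload_len as usize)?;
    let payload = data.get(payload_start..payload_end)?;
    Some(BcZ { identifier, payload })
}
```

Anything after the declared payload length is ignored, which matches the script's use of head -c.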

In terms of performing the final link, there are a few things to bear in mind. First, when rustc performs the final link it automatically generates the symbols __rust_alloc, __rust_alloc_zeroed, __rust_dealloc, and __rust_realloc, and points them at one of two sets of implementations: __rg_alloc (and the corresponding __rg_ symbols), which forward to the allocator registered with #[global_allocator], or __rdl_alloc (and the corresponding __rdl_ symbols), which is the default allocator in libstd, backed by the system allocator (libc malloc). If you are not using rustc to do the final link, you will have to define the four __rust_ symbols yourself.
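
As a rough illustration of that first point, here is a minimal sketch of what providing the allocation entry points could look like on a target without libc. Everything in it is an assumption made for illustration: the Bump type, HEAP_SIZE and the shim_ names are made up, and the parameter lists follow how I understand the alloc crate to declare the __rust_ functions, which is an internal detail that can differ between compiler versions. In a real build the four functions would be exported unmangled under the __rust_ names; they are called shim_ here so that the sketch still builds when rustc does the link and generates those symbols itself.

```rust
use core::alloc::{GlobalAlloc, Layout};
use core::cell::UnsafeCell;
use core::sync::atomic::{AtomicUsize, Ordering};

/// Size of the static heap (an arbitrary value for this sketch).
pub const HEAP_SIZE: usize = 64 * 1024;

#[repr(align(16))]
struct Heap(UnsafeCell<[u8; HEAP_SIZE]>);

/// Hypothetical bump allocator: carves memory out of a static buffer
/// and never reuses it.
pub struct Bump {
    heap: Heap,
    next: AtomicUsize, // offset of the first unused byte
}

// Sharing is sound because each allocation claims a disjoint range
// through the atomic offset.
unsafe impl Sync for Bump {}

unsafe impl GlobalAlloc for Bump {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let base = self.heap.0.get() as *mut u8;
        let mut cur = self.next.load(Ordering::Relaxed);
        loop {
            // Round the absolute address up to the requested alignment.
            let addr = base as usize + cur;
            let aligned = (addr + layout.align() - 1) & !(layout.align() - 1);
            let start = aligned - base as usize;
            let end = match start.checked_add(layout.size()) {
                Some(end) if end <= HEAP_SIZE => end,
                _ => return core::ptr::null_mut(), // out of memory
            };
            match self
                .next
                .compare_exchange_weak(cur, end, Ordering::Relaxed, Ordering::Relaxed)
            {
                Ok(_) => return unsafe { base.add(start) },
                Err(seen) => cur = seen,
            }
        }
    }

    // A bump allocator never frees individual blocks.
    unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {}
}

static ALLOCATOR: Bump = Bump {
    heap: Heap(UnsafeCell::new([0; HEAP_SIZE])),
    next: AtomicUsize::new(0),
};

// Stand-in for `__rust_alloc` (assumed parameters: size, align).
pub unsafe fn shim_alloc(size: usize, align: usize) -> *mut u8 {
    unsafe { ALLOCATOR.alloc(Layout::from_size_align_unchecked(size, align)) }
}

// Stand-in for `__rust_alloc_zeroed`.
pub unsafe fn shim_alloc_zeroed(size: usize, align: usize) -> *mut u8 {
    unsafe { ALLOCATOR.alloc_zeroed(Layout::from_size_align_unchecked(size, align)) }
}

// Stand-in for `__rust_dealloc`.
pub unsafe fn shim_dealloc(ptr: *mut u8, size: usize, align: usize) {
    unsafe { ALLOCATOR.dealloc(ptr, Layout::from_size_align_unchecked(size, align)) }
}

// Stand-in for `__rust_realloc` (assumed order: ptr, old size, align, new size).
pub unsafe fn shim_realloc(ptr: *mut u8, old_size: usize, align: usize, new_size: usize) -> *mut u8 {
    unsafe { ALLOCATOR.realloc(ptr, Layout::from_size_align_unchecked(old_size, align), new_size) }
}
```

A bump allocator is used only because it is short; any GlobalAlloc implementation can sit behind the same four functions.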

Second, libstd and libcore depend on other libraries that you will probably also have to link against. The required set depends on which parts of the standard library you use, so I can't be more specific without an error message, but the libraries my application ended up requiring were, in order: std, core, alloc, unwind, compiler_builtins, panic_abort, backtrace_sys, rustc_demangle. If you are using panic=unwind, you will have to link panic_unwind in place of panic_abort. If you still have missing symbols, I would suggest using nm to find the library that defines each missing symbol, then working out where it belongs in the link order by trial and error.
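
The nm search can be automated once the output of nm has been captured for each library. The sketch below is one way to do that, under the assumption that nm prints its default three-column form (address, type letter, name) for defined symbols and omits the address for undefined ones. The function names defines_symbol and providers are made up for this example.

```rust
/// Returns true if `nm_output` lists `symbol` as defined.
/// Assumes default nm output: "<address> <type> <name>" for defined symbols
/// and "<type> <name>" (no address) for undefined ones.
pub fn defines_symbol(nm_output: &str, symbol: &str) -> bool {
    nm_output.lines().any(|line| {
        let fields: Vec<&str> = line.split_whitespace().collect();
        // Only lines with an address column count as definitions;
        // type "U" marks an undefined reference.
        matches!(fields.as_slice(), [_, kind, name] if *name == symbol && *kind != "U")
    })
}

/// Given (library name, captured nm output) pairs, returns the names of the
/// libraries that define `symbol`, in the order they were given.
pub fn providers<'a>(libs: &[(&'a str, &str)], symbol: &str) -> Vec<&'a str> {
    libs.iter()
        .filter(|(_, output)| defines_symbol(output, symbol))
        .map(|(name, _)| *name)
        .collect()
}
```

With a traditional static linker, a library that defines a symbol usually has to appear after the objects that reference it, so the result tells you which library to move later in the link order.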

Hope this helps, as I've spent a fair amount of effort engineering a solution to this exact problem (although not for the purposes of cross compilation).

Dwight Guth answered Oct 13 '22 00:10
