Encoding tagged unions (sum types) in LLVM

Tags: llvm, llvm-ir

I'm trying to encode a tagged union (also known as a sum type) in LLVM, and it doesn't seem possible while keeping the compiler frontend platform-agnostic. Suppose I have this tagged union (expressed in Rust):

enum T {
    C1(i32, i64),
    C2(i64)
}

To encode this in LLVM I need to know the size of the largest variant. That in turn requires that I know the alignment and size of all fields. In other words, my frontend would need to

  • track the size and alignment of all things,
  • create a dummy struct (properly padded) to represent the biggest type that can fit any variant (e.g. {[2 x i64]}, assuming the tag can fit in the same word as the i32 field),
  • and finally either use packed structs or tell LLVM which "data layout" I assumed, so that my computations match LLVM's.
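The bookkeeping listed above can be sketched as follows. This is only an illustration in Rust, not code from any real frontend: `Layout`, `struct_layout`, `tagged_union_layout` and `example_t_layout` are hypothetical names, the tag is assumed to be a 32-bit integer stored as the first field of every variant, and the sizes and alignments of i32 and i64 (4/4 and 8/8) are assumed values for a typical 64-bit target. A real frontend would take those numbers from the data layout it hands to LLVM.

```rust
/// Size and alignment of a type in bytes, for one assumed target.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Layout {
    pub size: u64,
    pub align: u64,
}

// Assumed values for a typical 64-bit target (not queried from LLVM).
pub const I32: Layout = Layout { size: 4, align: 4 };
pub const I64: Layout = Layout { size: 8, align: 8 };

/// Round `offset` up to the next multiple of `align` (a power of two).
fn align_to(offset: u64, align: u64) -> u64 {
    (offset + align - 1) & !(align - 1)
}

/// Layout of a non-packed struct: each field is placed at the next offset
/// that satisfies its alignment, and the total size is rounded up to the
/// largest field alignment.
pub fn struct_layout(fields: &[Layout]) -> Layout {
    let mut size: u64 = 0;
    let mut align: u64 = 1;
    for f in fields {
        size = align_to(size, f.align) + f.size;
        align = align.max(f.align);
    }
    Layout { size: align_to(size, align), align }
}

/// Storage needed for a tagged union in which every variant is laid out
/// as a struct `{ tag, payload fields... }`.
pub fn tagged_union_layout(tag: Layout, variants: &[&[Layout]]) -> Layout {
    let mut result = struct_layout(&[tag]);
    for &payload in variants {
        let mut fields = vec![tag];
        fields.extend_from_slice(payload);
        let v = struct_layout(&fields);
        result.size = result.size.max(v.size);
        result.align = result.align.max(v.align);
    }
    result.size = align_to(result.size, result.align);
    result
}

/// Layout of `enum T { C1(i32, i64), C2(i64) }` with an i32 tag.
pub fn example_t_layout() -> Layout {
    let c1: &[Layout] = &[I32, I64];
    let c2: &[Layout] = &[I64];
    tagged_union_layout(I32, &[c1, c2])
}
```

Under these assumptions T needs 16 bytes at 8-byte alignment, which agrees with the [2 x i64] guess. If i64 were only 4-byte aligned, the C2 variant on its own would shrink to 12 bytes, which is why the numbers cannot be computed without fixing a target.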

What is the best way currently to encode tagged unions in LLVM?

tibbe asked Oct 29 '22 18:10


1 Answer

Conceptually, I don't think there's a better way than what you described, except that I wouldn't bother with using the constructed type at the declaration site, since actually accessing the union would be easiest to do through a bitcast anyway.
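To make the "opaque storage plus cast" idea concrete, here is a rough analogue written in Rust rather than LLVM IR. It is a sketch under assumptions, not what any particular frontend emits: the struct names, the tag values, the 32-bit tag at offset 0 and the two-word storage are all made up for illustration. The value is declared as plain storage, and each access reinterprets a pointer to it as the struct for the variant in question, much as a frontend would cast the pointer before a load or store.

```rust
use std::mem::{align_of, size_of, MaybeUninit};

// Hypothetical per-variant views; each begins with the tag.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct C1 { pub tag: u32, pub a: i32, pub b: i64 }

#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct C2 { pub tag: u32, pub x: i64 }

// Hypothetical tag values.
pub const TAG_C1: u32 = 0;
pub const TAG_C2: u32 = 1;

/// Opaque storage, the counterpart of declaring the value as two 64-bit words.
pub struct T { storage: MaybeUninit<[u64; 2]> }

// Compile-time checks that the storage can hold every variant.
const _: () = assert!(size_of::<C1>() <= size_of::<[u64; 2]>());
const _: () = assert!(size_of::<C2>() <= size_of::<[u64; 2]>());
const _: () = assert!(align_of::<C1>() <= align_of::<[u64; 2]>());
const _: () = assert!(align_of::<C2>() <= align_of::<[u64; 2]>());

impl T {
    pub fn new_c1(a: i32, b: i64) -> T {
        let mut storage = MaybeUninit::<[u64; 2]>::uninit();
        // SAFETY: the storage is large and aligned enough for C1 (checked above).
        unsafe { storage.as_mut_ptr().cast::<C1>().write(C1 { tag: TAG_C1, a, b }) };
        T { storage }
    }

    pub fn new_c2(x: i64) -> T {
        let mut storage = MaybeUninit::<[u64; 2]>::uninit();
        // SAFETY: the storage is large and aligned enough for C2 (checked above).
        unsafe { storage.as_mut_ptr().cast::<C2>().write(C2 { tag: TAG_C2, x }) };
        T { storage }
    }

    /// Read the tag, which every constructor stores at offset 0.
    pub fn tag(&self) -> u32 {
        // SAFETY: both constructors initialize a u32 at offset 0.
        unsafe { self.storage.as_ptr().cast::<u32>().read() }
    }

    /// Reinterpret the storage as C1, but only if the tag says so.
    pub fn as_c1(&self) -> Option<C1> {
        if self.tag() == TAG_C1 {
            // SAFETY: the tag shows that a C1 was written by `new_c1`.
            Some(unsafe { self.storage.as_ptr().cast::<C1>().read() })
        } else {
            None
        }
    }

    /// Reinterpret the storage as C2, but only if the tag says so.
    pub fn as_c2(&self) -> Option<C2> {
        if self.tag() == TAG_C2 {
            // SAFETY: the tag shows that a C2 was written by `new_c2`.
            Some(unsafe { self.storage.as_ptr().cast::<C2>().read() })
        } else {
            None
        }
    }
}
```

The storage type never mentions the variants, so the only place the variant structs appear is at the point of access, which is the simplification suggested above.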

Here's a code snippet from Clang's getTypeExpansion(), showing it also does this - manually finding the largest field:

const FieldDecl *LargestFD = nullptr;
CharUnits UnionSize = CharUnits::Zero();

for (const auto *FD : RD->fields()) {
  // Skip zero length bitfields.
  if (FD->isBitField() && FD->getBitWidthValue(Context) == 0)
    continue;
  assert(!FD->isBitField() &&
         "Cannot expand structure with bit-field members.");
  CharUnits FieldSize = Context.getTypeSizeInChars(FD->getType());
  if (UnionSize < FieldSize) {
    UnionSize = FieldSize;
    LargestFD = FD;
  }
}
if (LargestFD)
  Fields.push_back(LargestFD);
Oak answered Nov 16 '22 12:11