Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Traversal of LLVM Operands

Using a ModulePass, my goal is to traverse a SSA graph upwards: going from one Statement with 0..2 operands (most opcodes fall under that), I want to find out two things:

  1. Is an operand a metadata / constant (easy: just try casting to Constant-Type) or a variable?
  2. If it is a variable, get me the Statement where it is defined (as LLVM IR is in SSA form, this is a well-formed query), so I can recursively continue this traversal.

As an example, assume following LLVM IR:

define i32 @mul_add(i32 %x, i32 %y, i32 %z) {
entry:
  %tmp = mul i32 %x, %y
  %tmp2 = add i32 %tmp, %z
  ret i32 %tmp2
}

I will be starting with the return statement. Now I want to find out what I'm returning:

  1. I will get myself the operands of the return statement
  2. I detect that the operand is a variable called %tmp2
  3. I will get myself the operands of the statement where %tmp2 is defined
  4. I will first traverse the first operand, %tmp
  5. (...)
  6. %x is a function parameter and therefore probably the end of my traversal (or is it?)
  7. (... continue with the other branches in this DFS ...)

How would I use the C++ API to implement those steps?

like image 459
5-to-9 Avatar asked Jul 06 '17 10:07

5-to-9


1 Answers

The solution is simple and I will describe it as generic as possible.

Each Operand of an llvm::Instruction in LLVM is of supertype llvm::Value. A subtype of Value is llvm::Instruction. This allows recursion via following methods:

bool runOnInstruction(llvm::Instruction *instruction) {
  bool flag = false;
  for (auto operand = instruction->operands().begin();
            operand != instruction->operands().end(); ++operand) {
    printOperandStats(operand->get());
    flag = runOnOperand(operand->get()) | flag;
  }
  return flag;
}

bool runOnOperand(llvm::Value *operand) {
  operand->printAsOperand(errs(), true);
  // ... do something else ... 
  auto *instruction = dyn_cast<llvm::Instruction>(operand);
  if (nullptr != instruction) {
    return runOnInstruction(instruction);
  } else {
    return false;
  }
}

This equals a DFS as required by the question. Each Operand will be cast to an Instruction and will be recursively analyzed. The boolean return value is used as usual for LLVM Passes: a value of true describes a modification to the IR.

like image 62
5-to-9 Avatar answered Sep 29 '22 07:09

5-to-9