Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Instrumenting C/C++ code using LLVM

I want to write a LLVM pass to instrument every memory access. Here is what I am trying to do.

Given any C/C++ program (like the one given below), I am trying to insert calls to some function, before and after every instruction that reads/writes to/from memory. For example consider the below C++ program (Account.cpp)

#include <stdio.h>

class Account {
int balance;

public:
Account(int b)
{
   balance = b;   
}
~Account(){ }

int read() 
{
  int r;  
  r = balance;   
  return r;
}

void deposit(int n) 
{   
  balance = balance + n;   
}

void withdraw(int n) 
{
  int r = read();   
  balance = r - n;   
}
};

int main ()
{ 
  Account* a = new Account(10); 
  a->deposit(1);
  a->withdraw(2);  
  delete a; 
}

So after the instrumentation my program should look like :

#include <stdio.h>

class Account 
{
  int balance;

public:
Account(int b)
{
  balance = b;   
}
~Account(){ }

int read() 
{
  int r;  
  foo();
  r = balance;
  foo();   
  return r;
}

void deposit(int n) 
{ 
  foo(); 
  balance = balance + n;
  foo();   
}

void withdraw(int n) 
{
  foo();
  int r = read();
  foo();
  foo();   
  balance = r - n;
  foo();   
}
};

int main ()
{ 
  Account* a = new Account(10); 
  a->deposit(1);
  a->withdraw(2);  
  delete a; 
}

where foo() may be any function like get the current system time or increment a counter .. so on.

Please give me examples (source code, tutorials etc) and steps on how to run it. I have read the tutorial on how make a LLVM Pass given on http://llvm.org/docs/WritingAnLLVMPass.html, but couldn't figure out how write a pass for the above problem.

like image 825
Himanshu Shekhar Avatar asked Oct 18 '11 11:10

Himanshu Shekhar


2 Answers

I'm not very familiar with LLVM, but I am a bit more familiar with GCC (and its plugin machinery), since I am the main author of GCC MELT (a high level domain specific language to extend GCC, which by the way you could use for your problem). So I will try to answer in general terms.

You should first know why you want to adapt a compiler (or a static analyzer). It is a worthwhile goal, but it does have drawbacks (in particular, w.r.t. redefining some operators or others constructs in your C++ program).

The main point when extending a compiler (be it GCC or LLVM or something else) is that you very probably should handle all its internal representation (and you probably cannot skip parts of it, unless you have a very narrow defined problem). For GCC it means to handle the more than 100 kinds of Tree-s and nearly 20 kinds of Gimple-s: in GCC middle end, the tree-s represent the operands and declarations, and the gimple-s represent the instructions. The advantage of this approach is that once you've done that, your extension should be able to handle any software acceptable by the compiler. The drawback is the complexity of compilers' internal representations (which is explainable by the complexity of the definitions of the C & C++ source languages accepted by the compilers, and by the complexity of the target machine code they are generating, and by the increasing distance between source & target languages).

So hacking a general compiler (be it GCC or LLVM), or a static analyzer (like Frama-C), is quite a big task (more than a month of work, not a few days). To deal only with a tiny C++ programs like you are showing, it is not worth it. But it is definitely worth the effort if you plain to deal with large source software bases.

Regards

like image 100
Basile Starynkevitch Avatar answered Sep 25 '22 15:09

Basile Starynkevitch


Try something like this: ( you need to fill in the blanks and make the iterator loop work despite the fact that items are being inserted )

class ThePass : public llvm::BasicBlockPass {
  public:
  ThePass() : BasicBlockPass() {}
  virtual bool runOnBasicBlock(llvm::BasicBlock &bb);
};
bool ThePass::runOnBasicBlock(BasicBlock &bb) {
  bool retval = false;
  for (BasicBlock::iterator bbit = bb.begin(), bbie = bb.end(); bbit != bbie;
   ++bbit) { // Make loop work given updates
   Instruction *i = bbit;

   CallInst * beforeCall = // INSERT THIS
   beforeCall->insertBefore(i);

   if (!i->isTerminator()) {
      CallInst * afterCall = // INSERT THIS
      afterCall->insertAfter(i);
   }
  }
  return retval;
}

Hope this helps!

like image 45
user835943 Avatar answered Sep 24 '22 15:09

user835943