LLVM newbie here. I have the following C++ program:
using namespace std;
struct A {
    int i;
    int j;
};
int main()
{
    struct A obj;
    obj.i = 10;
    obj.j = obj.i;
    return 0;
}
Using clang++, I can see that the LLVM IR contains the struct definition shown below:
%struct.A = type { i32, i32 }
I would like to obtain the structure elements using an LLVM pass. I wrote the following pass, which iterates over the global variables and over the operands of every instruction, but neither approach helps me extract struct A, A.i or A.j.
#include "llvm/Pass.h"
#include "llvm/IR/Function.h"
#include "llvm/Support/raw_ostream.h"
#include <llvm/IR/Constants.h>
#include <llvm/IR/DerivedTypes.h>
#include <llvm/IR/Instructions.h>
#include <llvm/IR/IntrinsicInst.h>
#include <llvm/IR/LLVMContext.h>
#include <llvm/IR/Module.h>
#include <iostream>
#include <map>
#include <vector>
using namespace llvm;
namespace {
class StructModulePass : public ModulePass {
public:
    static char ID;
    StructModulePass() : ModulePass(ID) {}
    bool runOnModule(Module &M) override {
        // Iterate over the global variables of the module.
        unsigned idx = 0; // was uninitialized in the first version
        for (auto &G : M.globals()) {
            errs() << idx++ << " == > ";
            errs().write_escaped(G.getName()) << "\n";
        }
        // Iterate over every instruction: module -> function -> basic block -> instruction.
        // The loops are nested explicitly so that every basic block is visited,
        // not only the last one of each function.
        for (auto &F : M.functions()) {
            for (auto &B : F) {
                for (auto &I : B) {
                    for (unsigned i = 0; i < I.getNumOperands(); i++)
                        // Operand names may be empty: LLVM values need not be named.
                        std::cerr << I.getOperand(i)->getName().str() << std::endl;
                }
            }
        }
        return false; // the module is only inspected, never modified
    }
};
}
char StructModulePass:: ID = 0;
static RegisterPass<StructModulePass> X("getstructnamesize", "Get All Struct Names and Sizes",
false /* Only looks at CFG */ ,
false /* Analysis Pass */);
I want to create a database of all structures (global and local) that are defined and used in my program, e.g. < A , <int32, int32> , B , <int32, bool , char *>>.
I have gone through the doxygen pages and the LLVM tutorials to see whether the struct fields can be retrieved, but I am unable to find a way to extract the structures without already knowing their contents (e.g. by creating an IRBuilder and inserting variables of a predefined IntTy32 type). Any help or pointers to relevant tutorials would be appreciated.
In LLVM IR terminology, a "global" is a global variable or global constant. This line:
%struct.A = type { i32, i32 }
is an identified structure specification, not a global variable, just like how typedef in C++ is not a global variable. You can iterate over those using Module::getIdentifiedStructTypes().
Some notes, however:
Get familiar with the dump() method. It's a far easier alternative to all your prints to cerr.
You're using getName() on values, not on types - I don't think that's what you meant to do. Also keep in mind LLVM values do not necessarily have names.
Getting out results like <int32, bool, char *> - which are C++ types, not LLVM IR types - will be tricky. For instance, Clang will probably compile both bool and char to i8, and it won't be easy to tell the difference. You might also get a vptr field, padding fields, etc. If you really do want the actual C++ structure of the structs used in the source program, you have to rely on debug info.