
Argument forwarding in LLVM

I need some advice on "forwarding" arguments to a callee (in the LLVM-IR).

Suppose I have a function F that is called at the beginning of all other functions in the module. From F I need to access (read) the arguments passed to its immediate caller.

Right now I box all arguments in the caller inside a struct and pass F an i8* pointer to that struct, alongside an identifier telling F which caller it is being called from. F then has a giant switch that branches to the appropriate unboxing code. This is necessary because the functions in the module have differing signatures (differing argument and return value counts and types, and even differing calling conventions). It is obviously suboptimal in both performance and code size, though: I need to allocate the struct on the stack, copy the arguments into it, pass an additional pointer to F, and then perform the unboxing.

I was wondering if there's a better way to do this, i.e. a way for a function to access the stack frame of its immediate caller (knowing, thanks to the identifier, which caller it was called from) or, more generally, arbitrary values defined in its immediate caller. Any suggestions?

Note: the whole point of what I'm working on is having a single function F that does all this; splitting/inlining/specializing/templating F is not an option.


To clarify, suppose we have the following functions FuncA and FuncB (note: what follows is just pseudo-C code; always remember we are talking about LLVM IR!)

Type1 FuncA(Type2 ArgA1) {
  F();
  // ...
}

Type3 FuncB(Type4 ArgB1, Type5 ArgB2, Type6 ArgB3) {
  F();
  // ...
}

What I need is an efficient way for the function F to do the following:

void F() {
  switch (caller) {
    case FuncA:
      // do something with ArgA1
      break;
    case FuncB:
      // do something with ArgB1, ArgB2, ArgB3
      break;
  }
}

As I explained in the first part, right now my F looks like this:

// one struct per caller signature
typedef struct { Type2 ArgA1; } Args_FuncA;
typedef struct { Type4 ArgB1; Type5 ArgB2; Type6 ArgB3; } Args_FuncB;

void F(int callerID, void *args) {
  switch (callerID) {
    case ID_FuncA: {
      // unbox FuncA's arguments
      Args_FuncA *ArgsFuncA = (Args_FuncA*)args;
      Type2 ArgA1 = ArgsFuncA->ArgA1;
      // do something with ArgA1
      break;
    }
    case ID_FuncB: {
      // unbox FuncB's arguments
      Args_FuncB *ArgsFuncB = (Args_FuncB*)args;
      Type4 ArgB1 = ArgsFuncB->ArgB1;
      Type5 ArgB2 = ArgsFuncB->ArgB2;
      Type6 ArgB3 = ArgsFuncB->ArgB3;
      // do something with ArgB1, ArgB2, ArgB3
      break;
    }
  }
}

The two functions then become:

Type1 FuncA(Type2 ArgA1) {
  Args_FuncA args = { ArgA1 };
  F(ID_FuncA, (void*)&args);
  // ...
}

Type3 FuncB(Type4 ArgB1, Type5 ArgB2, Type6 ArgB3) {
  Args_FuncB args = { ArgB1, ArgB2, ArgB3 };
  F(ID_FuncB, (void*)&args);
  // ...
}
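
For reference, the scheme above could be written as compilable C roughly as follows. This is only a sketch: the concrete types (int32_t, double, char) stand in for Type1 through Type6, and the global `observed` is a hypothetical sink added so that the effect of F can be checked. None of these appear in the real module.

```c
#include <assert.h>
#include <stdint.h>

enum CallerID { ID_FuncA, ID_FuncB };

/* One struct per caller signature (placeholder types, assumed). */
typedef struct { int32_t ArgA1; } Args_FuncA;
typedef struct { int32_t ArgB1; double ArgB2; char ArgB3; } Args_FuncB;

/* Hypothetical sink so that what F read is observable from outside. */
double observed;

/* Single entry hook: unboxes the arguments according to the caller ID. */
void F(enum CallerID callerID, const void *args) {
    switch (callerID) {
    case ID_FuncA: {
        const Args_FuncA *a = args;
        observed = a->ArgA1;
        break;
    }
    case ID_FuncB: {
        const Args_FuncB *b = args;
        observed = b->ArgB1 + b->ArgB2 + b->ArgB3;
        break;
    }
    default:
        assert(0 && "unknown caller ID");
    }
}

int32_t FuncA(int32_t ArgA1) {
    Args_FuncA args = { ArgA1 };   /* box on the caller's stack */
    F(ID_FuncA, &args);
    return ArgA1;
}

double FuncB(int32_t ArgB1, double ArgB2, char ArgB3) {
    Args_FuncB args = { ArgB1, ArgB2, ArgB3 };
    F(ID_FuncB, &args);
    return ArgB2;
}
```

The costs mentioned earlier are all visible here: a stack slot per call, one store per argument, the extra pointer parameter, and the loads inside F.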

CAFxX asked Aug 22 '11 08:08


2 Answers

Not sure if this helps, but I had a similar problem and got around the limitations of LLVM's TBAA (type-based alias analysis) by using an LLVM vector to store the intermediate values. LLVM's optimization passes were later able to turn the vector loads and stores into scalar registers.

There were a few caveats as I recall. Let me know if you explore this route and I can dig up some code.
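
One possible reading of this suggestion, sketched in C with the GCC/Clang `vector_size` extension (which Clang lowers to an LLVM vector type such as <4 x i32>). This is not the answerer's code. It assumes every argument fits a 32-bit integer lane, since all lanes of a vector share one element type, and the names `ArgVec` and `observed` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Four 32-bit lanes, passed by value instead of through a pointer. */
typedef int32_t ArgVec __attribute__((vector_size(16)));

enum CallerID { ID_FuncA, ID_FuncB };

/* Hypothetical sink so that what F read is observable from outside. */
int32_t observed;

/* F reads the lanes it needs for the given caller; unused lanes are ignored. */
void F(enum CallerID callerID, ArgVec args) {
    switch (callerID) {
    case ID_FuncA:
        observed = args[0];
        break;
    case ID_FuncB:
        observed = args[0] + args[1] + args[2];
        break;
    default:
        assert(0 && "unknown caller ID");
    }
}

int32_t FuncA(int32_t a1) {
    ArgVec args = { a1, 0, 0, 0 };     /* pad unused lanes with zero */
    F(ID_FuncA, args);
    return observed;
}

int32_t FuncB(int32_t b1, int32_t b2, int32_t b3) {
    ArgVec args = { b1, b2, b3, 0 };
    F(ID_FuncB, args);
    return observed;
}
```

Whether the vector ends up in a register rather than in memory depends on the target ABI and on which optimization passes run, so this would need to be checked against the generated code.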

Mike Woodworth answered Nov 12 '22 03:11


IMHO you've done it right. While there are solutions at the machine-code assembly level, I am afraid there might be no solution in LLVM assembly, as it is "higher level". If you want to run a function at the beginning of other functions, have you considered looking at:

  • debugger sources (like gdb)
  • Binary Instrumentation with Valgrind

I know it's not a direct answer, but I hope it might be helpful in some way ;).

Grzegorz Wierzowiecki answered Nov 12 '22 03:11