Assume a simple partial evaluation scenario:
#include <vector>
/* may be known at runtime */
int someConstant();
/* can be partially evaluated */
double foo(std::vector<double> args) {
return args[someConstant()] * someConstant();
}
Let's say that someConstant()
is known and does not change at runtime (e.g. given by the user once) and can be replaced by the corresponding int literal. If foo
is part of the hot path, I expect a significant performance improvement:
/* partially evaluated, someConstant() == 2 */
double foo(std::vector<double> args) {
return args[2] * 2;
}
My current take on that problem would be to generate LLVM IR at runtime, because I know the structure of the partially evaluated code (so I would not need a general purpose partial evaluator).
So I want to write a function foo_ir
that generates IR code that does the same thing as foo
, but not calling someConstant()
, because it is known at runtime.
Simple enough, isn't it? Yet, when I look at the generated IR for the code above:
; Function Attrs: uwtable
define double @_Z3fooSt6vectorIdSaIdEE(%"class.std::vector"* %args) #0 {
%1 = call i32 @_Z12someConstantv()
%2 = sext i32 %1 to i64
%3 = call double* @_ZNSt6vectorIdSaIdEEixEm(%"class.std::vector"* %args, i64 %2)
%4 = load double* %3
%5 = call i32 @_Z12someConstantv()
%6 = sitofp i32 %5 to double
%7 = fmul double %4, %6
ret double %7
}
; Function Attrs: nounwind uwtable
define linkonce_odr double* @_ZNSt6vectorIdSaIdEEixEm(%"class.std::vector"* %this, i64 %__n) #1 align 2 {
%1 = alloca %"class.std::vector"*, align 8
%2 = alloca i64, align 8
store %"class.std::vector"* %this, %"class.std::vector"** %1, align 8
store i64 %__n, i64* %2, align 8
%3 = load %"class.std::vector"** %1
%4 = bitcast %"class.std::vector"* %3 to %"struct.std::_Vector_base"*
%5 = getelementptr inbounds %"struct.std::_Vector_base"* %4, i32 0, i32 0
%6 = getelementptr inbounds %"struct.std::_Vector_base<double, std::allocator<double> >::_Vector_impl"* %5, i32 0, i32 0
%7 = load double** %6, align 8
%8 = load i64* %2, align 8
%9 = getelementptr inbounds double* %7, i64 %8
ret double* %9
}
I see, that the []
was included from the STL definition (function @_ZNSt6vectorIdSaIdEEixEm
) - fair enough. The problem is: It could as well be some member function, or even a direct data access, I simply cannot assume the data layout to be the same everywhere, so at development-time, I do not know the concrete std::vector
layout of the host machine.
Is there some way to use C++ metaprogramming to get the required information at compile time? i.e. is there some way to ask llvm to provide IR for std::vector
's []
method?
As a bonus: I would prefer to not enforce the compilation of the library with clang, instead, LLVM shall be a runtime-dependency, so just invoking clang at compile time (even if I do not know how to do this) is a second-best solution.
Answering my own question:
While I still have no solution for the general case (e.g. std::map
), there exists a simple solution for std::vector
:
According to the C++ standard, the following holds for the member function data()
Returns a direct pointer to the memory array used internally by the vector to store its owned elements.
Because elements in the vector are guaranteed to be stored in contiguous storage locations in the same order as represented by the vector, the pointer retrieved can be offset to access any element in the array.
So in fact, the object-level layout of std::vector
is fixed by the standard.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With