I have the following function:
double
neville (double xx, size_t n, const double *x, const double *y, double *work);
which performs Lagrange interpolation at xx
using the n
points stored in x
and y
. The work
array has size 2 * n
. Since this is polynomial interpolation, n
is in the ballpark of ~5, very rarely more than 10.
This function is aggressively optimized, and supposed to be called in tight loops. Profiling suggests that heap allocating the work array in the loop is bad. Unfortunately, I'm supposed to package this into a function-like class, and clients must be unaware of the work array.
For now, I use a template integer argument for the degree and std::array
to avoid dynamic allocation of the work
array:
template <size_t n>
struct interpolator
{
double operator() (double xx) const
{
std::array<double, 2 * n> work;
size_t i = locate (xx); // not shown here, no performance impact
// due to clever tricks + nice calling patterns
return neville (xx, n, x + i, y + i, work.data ());
}
const double *x, *y;
};
It would have been possible to store the work array as a mutable member of the class, but operator()
is supposed to be used concurrently by several threads. This version is OK provided you know n
at compile time.
Now, I need the n
parameter to be specified at run time. I am wondering about something like this:
double operator() (double xx) const
{
auto work = static_cast<double*> (alloca (n * sizeof (double)));
...
Some bells ring when using alloca
: I'm of course going to have a cap on n
to avoid the alloca
call to overflow (anyway it's quite stupid to use degree 100 polynomial interpolation).
I'm quite unconfortable with the approach however:
alloca
?Using alloca wastes very little space and is very fast. (It is open-coded by the GNU C compiler.) Since alloca does not have separate pools for different sizes of blocks, space used for any size block can be reused for any other size. alloca does not cause memory fragmentation.
The alloca() function allocates size bytes of space in the stack frame of the caller. This temporary space is automatically freed when the function that called alloca() returns to its caller.
alloca() is very useful if you can't use a standard local variable because its size would need to be determined at runtime and you can absolutely guarantee that the pointer you get from alloca() will NEVER be used after this function returns. do not return the pointer, or anything that contains it.
alloca or stack allocation is considerably faster than malloc/heap allocation. There should be a way to indicate either by inference of usage or by explicit call that the compiler should be able to emit a stack allocated memory buffer.
RETURN VALUE The alloca () function returns a pointer to the beginning of the allocated space. If the allocation causes stack overflow, program behaviour is undefined.
Actually, alloca is not guaranteed to use the stack. Indeed, the gcc-2.95 implementation of alloca allocates memory from the heap using malloc itself. Also that implementation is buggy, it may lead to a memory leak and to some unexpected behavior if you call it inside a block with a further use of goto.
The focus in the documentation is usually on the concept that alloca memory is associated with a function activation, not with any block; that multiple invocations of alloca just grab more stack memory which is all released when the function terminates. Not so; the memory is actually associated with the procedure context.
Nonlocal exits done with longjmp (see Non-Local Exits) automatically free the space allocated with alloca when they exit through the function that called alloca. This is the most important reason to use alloca Show activity on this post.
I'm quite unconfortable with the approach however:
- Am I missing some obvious danger of alloca ?
You pointed the one real danger out: stack overflow behaviour is undefined for alloca
. In addition, alloca
isn’t actually standardised. For instance, Visual C++ has _alloca
instead, and GCC by default defines it as a macro. That problem can be circumvented fairly easily, however, by providing a thin wrapper around the few existing implementations.
- Is there a better way to avoid heap allocation here ?
Not really. C++14 will have a (potentially!) stack allocated variable-length array type. But until then, and when you consider std::array
not to be a good fit, go for alloca
in cases such as yours.
Minor nitpick though: your code is missing a cast of the return value of alloca
. It shouldn’t even compile.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With