In a C++ CPU-bound simulation that I'm writing, I've traced a bottleneck in my program via valgrind to cmath::exp. It currently eats over 40% of my simulation time. I can bound the input to a relatively small domain, but I'd like to control the accuracy. I was considering moving to a LUT (lookup table) to replace exp, but I'm not quite sure how to do this the "right way"(tm). Concerns I have:
What is the best way to implement a LUT for exp (or include one from a library)?
The optimal lookup table size is determined by the trade-off you make between performance, accuracy, and implementation complexity. You will have to profile; we cannot tell you the answer (we don't know the answer).
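For illustration only, here is a minimal sketch of the kind of table you would be profiling, assuming the input really can be clamped to a known range [lo, hi]; the class name, range handling, and linear interpolation are my own choices, not something prescribed by the question:

#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch: exp() approximated by n samples over [lo, hi]
// (n >= 2) with linear interpolation between neighbouring entries.
// Accuracy vs. memory is controlled entirely by n -- profile to pick it.
class ExpTable {
public:
    ExpTable(double lo, double hi, std::size_t n)
        : lo_(lo), step_((hi - lo) / (n - 1)), table_(n)
    {
        for (std::size_t i = 0; i < n; ++i)
            table_[i] = std::exp(lo_ + i * step_);
    }

    double operator()(double x) const
    {
        // Caller is assumed to keep x inside [lo, hi].
        double pos = (x - lo_) / step_;
        std::size_t i = static_cast<std::size_t>(pos);
        if (i + 1 >= table_.size())          // x at (or just past) hi
            return table_.back();
        double frac = pos - i;               // in [0, 1)
        return table_[i] + frac * (table_[i + 1] - table_[i]);
    }

private:
    double lo_, step_;
    std::vector<double> table_;
};

For a smooth function like exp, doubling n roughly quarters the linear-interpolation error, so trying a handful of sizes is usually enough to map out the accuracy/cache-pressure trade-off.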
Use lrint from <math.h> to convert double to long int. I'm not sure if it's in <cmath>.
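For example, a hypothetical index computation along those lines (lo and inv_step are placeholders for whatever range and spacing the table uses):

#include <cmath>

// Round (x - lo) / step to the nearest table slot.
long index_of(double x, double lo, double inv_step)
{
    return std::lrint((x - lo) * inv_step);
}

For what it's worth, C++11 did add std::lrint to <cmath>, so the <math.h> fallback is only needed on pre-C++11 toolchains.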
I am not sure what slope has to do with converting floating point numbers to integers. Could you elaborate on what you are worried about?
Yes, you are reinventing the wheel. What you are doing has been done over and over again, by anybody who has ever implemented a math library. There is a lot of literature on the subject.
A straight look-up table is far from optimal. You will want to use some kind of polynomial approximation, perhaps a piecewise one with coefficients selected from a look-up table. For a function as smooth and predictable as exp, a polynomial will give you much higher accuracy for the same amount of computational effort. The required polynomials will depend on the tradeoff between complexity and accuracy, as well as whether you want to minimize expected error, minimize maximum error, or use some other loss function.
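As a rough illustration of the piecewise idea (not a tuned implementation; the segment count, the choice of a cubic Taylor term, and all names here are my own), you can store exp at each segment midpoint in a table and evaluate a short polynomial around the nearest midpoint:

#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch: exp(x) ~ exp(m) * (1 + d + d^2/2 + d^3/6), where m is
// the midpoint of the segment containing x and d = x - m. The per-segment
// coefficient exp(m) lives in a small table.
class PiecewiseExp {
public:
    PiecewiseExp(double lo, double hi, std::size_t segments)
        : lo_(lo), width_((hi - lo) / segments), exp_mid_(segments)
    {
        for (std::size_t i = 0; i < segments; ++i)
            exp_mid_[i] = std::exp(lo_ + (i + 0.5) * width_);
    }

    double operator()(double x) const
    {
        // Caller is assumed to keep x inside [lo, hi].
        std::size_t i = static_cast<std::size_t>((x - lo_) / width_);
        if (i >= exp_mid_.size()) i = exp_mid_.size() - 1;   // x == hi
        double d = x - (lo_ + (i + 0.5) * width_);           // |d| <= width/2
        return exp_mid_[i] * (1.0 + d * (1.0 + d * (0.5 + d / 6.0)));
    }

private:
    double lo_, width_;
    std::vector<double> exp_mid_;
};

Coefficients fitted with a minimax tool (e.g. the Remez algorithm, or Sollya) will beat a plain Taylor expansion at the same degree; the sketch only shows the structure: cheap index, small coefficient table, short polynomial.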
Restricting the domain of exp does not actually help that much, since it's so easy to expand to the entire domain.
#include <cmath>

// only works for x in [0, 1]
double exp2_limited(double x);

// works for all x, but doesn't handle overflow properly
// (named exp2_full to avoid clashing with std::exp2 from <cmath>)
double exp2_full(double x)
{
    // x - floor(x) is the fractional part, always in [0, 1) even for
    // negative x; fmod(x, 1.0) would go negative there.
    double ip = std::floor(x);
    return std::scalbn(exp2_limited(x - ip), static_cast<int>(ip));
}
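If what you actually need is exp rather than exp2, the usual reduction is exp(x) = 2^(x · log2 e), so a base-2 kernel is all you have to approximate (a sketch reusing the hypothetical exp2_full above):

#include <cmath>

double exp_via_exp2(double x)
{
    // log2(e) = 1 / ln(2); computed once rather than relying on M_LOG2E,
    // which is not guaranteed by standard C++.
    static const double log2e = 1.0 / std::log(2.0);
    return exp2_full(x * log2e);
}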
Summary:
You have to know the required accuracy before you can design such a function.
You also have to choose the loss function (expected error, maximum error, or something else).
You have to profile before you know how fast it is.
Use polynomials.
I've had this problem and I took a few stack samples to diagnose it.
What that does is tell you where the calls are coming from and what the argument value is. I found that when exp was called from particular places, the argument value was highly repeatable.
That suggested a memoization approach, which made a huge difference.
It needed a simple "wrapper" function:
#include <cmath>

// Returns exp(arg), reusing the previous result when the argument has not
// changed since the last call at this call site.
double exp_cached(double arg, double* old_arg, double* old_result)
{
    if (arg == *old_arg) return *old_result;
    *old_arg = arg;
    *old_result = std::exp(arg);
    return *old_result;
}
and wherever exp(foo) used to be called, do:
static double old_arg = -999999999, old_result;
...
... exp_cached(foo, &old_arg, &old_result)...
That way, exp doesn't get called if the argument at that call site is the same as it was the last time.
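One way to package the same last-argument/last-result pair per call site, so callers don't have to declare the two statics themselves, is a tiny function object (my own variation on the wrapper above, not something the answer proposed):

#include <cmath>

// Hypothetical helper: each CachedExp instance remembers the last
// argument/result pair, exactly like exp_cached above.
struct CachedExp {
    double old_arg = -999999999.0;   // sentinel, same spirit as above
    double old_result = 0.0;

    double operator()(double arg)
    {
        if (arg != old_arg) {
            old_arg = arg;
            old_result = std::exp(arg);
        }
        return old_result;
    }
};

// At a call site:
//     static CachedExp cached_exp;
//     double y = cached_exp(foo);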