Function approximation

I have a function,

P(x0, x1, ..., xn)

that takes 100 integers as input and gives an integer as output. P is slow to evaluate (a single call can take anywhere from 30 seconds to a couple of minutes).

I need to know which input values will maximize the value returned by P.

What techniques can I use to accomplish this? I know people generally use genetic algorithms for this, but I'm afraid they would take ages to compute: even with a small population and few generations (say, population = 50, generations = 50), P is so slow that it would take more than 40 hours (50 × 50 = 2,500 evaluations at roughly a minute each).

Is there any cheaper method of doing it? Maybe an iterative process? It doesn't need to be truly optimal, but I don't have any idea of how P behaves (I've tried fitting linear / quadratic / exponential models, but they don't seem to yield any good values; I know P can return values at least 5-10 times better than what I'm getting).

It should also be something that's relatively easy to implement, since I have to implement it myself.

Thanks

edit: P is a stochastic process.

asked Dec 11 '09 by devoured elysium

People also ask

What is function approximation in reinforcement learning?

In summary, function approximation helps estimate the value of a state or an action when similar circumstances occur, whereas computing the exact values of V and Q requires a full computation and does not learn from past experience. Function approximation also saves computation time and memory.

What is function approximation in machine learning?

Function approximation is a technique for estimating an unknown underlying function using historical or available observations from the domain. Artificial neural networks learn to approximate a function.

How do we approximate a function?

If one has the function value and n derivatives at one point, x0, then one can calculate a polynomial approximation using the Taylor expansion: f(x) ≈ f(x0) + (x − x0) f′(x0) + … + ((x − x0)^n / n!) f^(n)(x0).
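As a small worked example (my own, not taken from the text above), the general Taylor polynomial and its second-order truncation for f(x) = e^x at x0 = 0 are:

% nth-order Taylor polynomial about x_0, then the n = 2 case for e^x
f(x) \approx \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!}\,(x - x_0)^k,
\qquad
e^{x} \approx 1 + x + \frac{x^2}{2}

Plugging in x = 0.1 gives about 1.105, versus the true value e^0.1 ≈ 1.10517.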

Why is function approximation needed?

The need for function approximations arises in many branches of applied mathematics, and computer science in particular, such as predicting the growth of microbes in microbiology. Function approximations are used where theoretical models are unavailable or hard to compute.


1 Answer

Simulated annealing, closely related to Markov Chain Monte Carlo (MCMC). The variant you probably want is Metropolis-Hastings. When you get the hang of it, it's quite nice. Possibly there are some ways to optimize it because your inputs and result are all integers. It is compute-intensive and may require some tuning, but it is quite robust, and I'm not sure other methods could do any better.

Here's some brain-dead code to do it:

#include <stdbool.h>
#include <stdlib.h>

#define N 100                   // length of the vector to optimize

int a[N];                       // the vector to optimize

double P(const int a[N]);       // the function to optimize, treated as an
                                // (unnormalized) probability of vector a;
                                // assumed to be defined elsewhere

// uniform choice from {-2, -1, 1, 2}; this is the Proposal Distribution,
// and it has to be symmetric
int proposal_step(void) {
    static const int steps[] = { -2, -1, 1, 2 };
    return steps[rand() % 4];
}

// uniform random double in [0, 1)
double unif01(void) {
    return rand() / ((double)RAND_MAX + 1.0);
}

void sample(long num_samples) {
    // for a large number of samples
    for (long i = 0; i < num_samples; i++) {
        // get P(a)
        double p = P(a);
        // for each element of vector a
        for (int j = 0; j < N; j++) {
            // get an amount by which to change a[j]
            int step = proposal_step();
            // make the change to a[j], and get p1, the new value of p
            a[j] += step;
            double p1 = P(a);
            bool keep_the_step = true;
            // if p1 is better than p, keep the step;
            // if p1 is worse than p, then keep the step p1/p of the time
            if (p1 < p) {
                keep_the_step = (unif01() < p1 / p);
            }
            if (keep_the_step) p = p1;
            else a[j] -= step;
        }
        // now a is a sample, and p is its value: record a and p
    }
}
// what you have now is a large random sampling of vectors from distribution P
// now you can choose the best one, the average, the variance,
// any statistic you like
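If all you care about is the maximizer rather than the full distribution, one minimal way to implement the "record a and p" step (my own sketch, reusing N and P from the code above) is to keep the best vector seen so far:

#include <string.h>

int best_a[N];            // best vector seen so far
double best_p = -1.0;     // assumes P(a) is never negative

// call this once per outer iteration with the current sample
void record(const int a[N], double p) {
    if (p > best_p) {
        best_p = p;
        memcpy(best_a, a, sizeof(best_a));
    }
}

After the sampling loop finishes, best_a holds the highest-scoring input found.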

Ways to tweak it are to widen or narrow the proposal distribution so it takes larger or smaller steps, or to have it take larger steps initially and smaller steps later. What you're looking for is a fraction of accepted steps that is neither too high nor too low. You probably also want a "burn-in" phase: an initial 1k or so samples that you throw away while it hunts for the area of the mode.
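For instance, one possible way to take larger steps first and smaller steps later is to shrink the proposal width linearly over the run; the names and the schedule here are my own illustration, not something prescribed by Metropolis-Hastings:

// a proposal that starts wide and narrows as the run progresses
int proposal_step_scheduled(long i, long num_samples, int max_width, int min_width) {
    // half-width shrinks from max_width down to min_width as i grows
    int width = max_width - (int)((long)(max_width - min_width) * i / num_samples);
    int magnitude = rand() % width + 1;              // 1 .. width
    return (rand() % 2) ? magnitude : -magnitude;    // symmetric around zero
}

Use it in place of proposal_step() in the loop above, and simply skip recording samples while i is below your burn-in count (the first 1000 iterations or so).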

And by all means, profile P. It needs to be as fast as possible. Here's my favorite way to do that.
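As a very crude first check (my own sketch, reusing P and N from the code above; not the profiling method alluded to here), you can at least time each call to P:

#include <stdio.h>
#include <time.h>

// wrapper that reports how long one evaluation of P takes
double timed_P(const int a[N]) {
    clock_t t0 = clock();
    double p = P(a);
    fprintf(stderr, "P took %.1f s\n", (double)(clock() - t0) / CLOCKS_PER_SEC);
    return p;
}

Note that clock() measures CPU time for this process, which is fine for a rough per-call figure but won't tell you where inside P the time goes.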

answered Oct 07 '22 by Mike Dunlavey