I would like to use optim() to optimize a cost function (fn argument), and I will be providing a gradient (gr argument). I can write separate functions for fn and gr. However, they have a lot of code in common and I don't want the optimizer to waste time repeating those calculations. So is it possible to provide one function that computes both the cost and the gradient? If so, what would be the calling syntax to optim()?
As an example, suppose the function I want to minimize is
cost <- function(x) {
x*exp(x)
}
Obviously, this is not the function I'm trying to minimize. That's too complicated to list here, but the example serves to illustrate the issue. Now, the gradient would be
grad <- function(x) {
(x+1)*exp(x)
}
So as you can see, the two functions, if called separately, would repeat some of the work (in this case, the exponential function). However, since optim() takes two separate arguments (fn and gr), it appears there is no way to avoid this inefficiency, unless there is a way to define a function like
costAndGrad <- function(x) {
ex <- exp(x)
list(cost=x*ex, grad=(x+1)*ex)
}
and then pass that function to optim(), which would need to know how to extract the cost and gradient.
Hope that explains the problem. Like I said my function is much more complicated, but the idea is the same: there is considerable code that goes into both calculations (cost and gradient), which I don't want to repeat unnecessarily.
By the way, I am an R novice, so there might be something simple that I'm missing!
Thanks very much
The function optim provides algorithms for general-purpose optimisations and the documentation is perfectly reasonable, but I remember that it took me a little while to get my head around how to pass data and parameters to optim.
By default optim performs minimization, but it will maximize if control$fnscale is negative.
optim(par, fn, data, ...) where: par: Initial values for the parameters to be optimized over. fn: A function to be minimized or maximized. data: The name of the object in R that contains the data.
nlm() function carries out a non linear minimization of the function f using a Newton-type algorithm. f : the function to be minimized, returning a single numeric value. This should be a function with first argument a vector of the length of p followed by any other arguments specified by the ... argument.
The nlm
function does optimization and it expects the gradient information to be returned as an attribute to the value returned as the original function value. That is similar to what you show above. See the examples in the help for nlm
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With