I am using the Optim.jl library to minimise a function in Julia, using the BFGS algorithm. Earlier today I asked another question about the same library, but to avoid confusion I decided to split it in two.
I would also like to get an estimate of the negative inverse Hessian after the optimisation, for further calculations.
On the GitHub page of the Optim library, I found the following working example:
using Optim
rosenbrock(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2
result = optimize(rosenbrock, zeros(2), BFGS())
How can I get the negative inverse Hessian from result after the optimisation? Is there any field in it holding either the Hessian, the inverse Hessian or the negative inverse Hessian?
EDIT
Thanks for the comments. Do you think it would be more efficient to edit "optimize.jl" so that the function also returns the inverse Hessian? See below for a working example - the edit has been introduced at line 226:
if state.method_string == "BFGS"
    return MultivariateOptimizationResults(state.method_string,
                                           initial_x,
                                           f_increased ? state.x_previous : state.x,
                                           f_increased ? state.f_x_previous : state.f_x,
                                           iteration,
                                           iteration == options.iterations,
                                           x_converged,
                                           options.x_tol,
                                           f_converged,
                                           options.f_tol,
                                           g_converged,
                                           options.g_tol,
                                           f_increased,
                                           tr,
                                           state.f_calls,
                                           state.g_calls,
                                           state.h_calls), state.invH
else
    return MultivariateOptimizationResults(state.method_string,
                                           initial_x,
                                           f_increased ? state.x_previous : state.x,
                                           f_increased ? state.f_x_previous : state.f_x,
                                           iteration,
                                           iteration == options.iterations,
                                           x_converged,
                                           options.x_tol,
                                           f_converged,
                                           options.f_tol,
                                           g_converged,
                                           options.g_tol,
                                           f_increased,
                                           tr,
                                           state.f_calls,
                                           state.g_calls,
                                           state.h_calls)
end
Or just:
return MultivariateOptimizationResults(state.method_string,
                                       initial_x,
                                       f_increased ? state.x_previous : state.x,
                                       f_increased ? state.f_x_previous : state.f_x,
                                       iteration,
                                       iteration == options.iterations,
                                       x_converged,
                                       options.x_tol,
                                       f_converged,
                                       options.f_tol,
                                       g_converged,
                                       options.g_tol,
                                       f_increased,
                                       tr,
                                       state.f_calls,
                                       state.g_calls,
                                       state.h_calls), state
To have full access to "state" after the optimization.
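With either patch in place, the call site would then unpack the extra return value. A minimal sketch, assuming the second variant above (the one returning state) is used:
using Optim

rosenbrock(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2

# The patched optimize now returns a (results, state) tuple:
result, state = optimize(rosenbrock, zeros(2), BFGS())

invH = state.invH   # BFGS approximation of the inverse Hessian
neg_invH = -invH    # the negative inverse Hessian asked for above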
EDIT 2
Since this change will be introduced in the next version of the Optim.jl library, there is no need to continue this discussion. For now, both the extended_trace and the after_while! tricks work. Personally, I prefer the latter, so I will close the discussion by accepting @Dan Getz's answer.
I know of one way, but whether it's worth it as opposed to estimating the inverse Hessian yourself I'm not sure. If you pass Optim.Options(store_trace=true, extended_trace=true), you can get a record of the optimization path, which includes the last ~invH. For example, after
result = optimize(rosenbrock, zeros(2), BFGS(),
Optim.Options(store_trace=true, extended_trace=true));
we can get
julia> result.trace[end]
16 5.471433e-17 2.333740e-09
* Current step size: 1.0091356566200325
* g(x): [2.33374e-9,-1.22984e-9]
* ~inv(H): [0.498092 0.996422; 0.996422 1.9983]
* x: [1.0,1.0]
julia> result.trace[end].metadata["~inv(H)"]
2×2 Array{Float64,2}:
0.498092 0.996422
0.996422 1.9983
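Since the question asks for the negative inverse Hessian specifically, negating that stored matrix gives it directly (assuming the BFGS approximation is accurate enough for your purposes):
neg_invH = -result.trace[end].metadata["~inv(H)"]   # elementwise negation of the stored approximation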
There are at least two things I don't like about this approach, though:
The first is that turning on extended_trace=true seems to force show_trace=true -- you'll notice I didn't show the output of the computation! It feels like a bug. This can be mitigated (though not removed) by setting show_every to something big, or avoided by redirecting stdout entirely (see the sketch after this list).
The second is that we should probably be able to get access to the last state without storing the whole path, but whether this is actually an issue will depend upon the size of your problem.
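For what it's worth, here is one way to keep the trace without the forced printing. This is a sketch of my own (only the Options keywords shown above are part of Optim's API), redirecting stdout to a scratch file around the call:
# Sketch: run the traced optimization with stdout sent to a throwaway file so the
# forced trace printing does not clutter the REPL; the trace itself is still stored.
result = open(tempname(), "w") do io
    redirect_stdout(io) do
        optimize(rosenbrock, zeros(2), BFGS(),
                 Optim.Options(store_trace=true, extended_trace=true))
    end
end
result.trace[end].metadata["~inv(H)"]   # approximate inverse Hessian, as before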
Another less than optimal approach which works is hooking into the internal Optim function after_while!, which currently does nothing, and using it to pull the information from the last state. In Julia code this looks like:
julia> using Optim
julia> rosenbrock(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2
rosenbrock (generic function with 1 method)
julia> Optim.after_while!{T}(d, state::Optim.BFGSState{T}, method::BFGS, options) = global invH = state.invH
julia> result = optimize(rosenbrock, zeros(2), BFGS())
Results of Optimization Algorithm
* Algorithm: BFGS
* Starting Point: [0.0,0.0]
* Minimizer: [0.9999999926033423,0.9999999852005353]
* Minimum: 5.471433e-17
* Iterations: 16
* Convergence: true
* |x - x'| < 1.0e-32: false
* |f(x) - f(x')| / |f(x)| < 1.0e-32: false
* |g(x)| < 1.0e-08: true
* f(x) > f(x'): false
* Reached Maximum Number of Iterations: false
* Objective Function Calls: 69
* Gradient Calls: 69
julia> invH
2×2 Array{Float64,2}:
0.498092 0.996422
0.996422 1.9983
This is unattractive for using a global variable and for depending on defining the after_while! before running/compiling optimize (but maybe in v0.6 this is already solved).
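If the rebinding of a plain global bothers you, a variant I would sketch (it uses the same internal after_while! hook and BFGSState field as above, so it is just as dependent on Optim's internals) is to write into a pre-declared constant Ref instead:
# Sketch: stash the final inverse Hessian in a const Ref defined before optimize runs,
# instead of rebinding a non-constant global on every call.
const last_invH = Ref{Matrix{Float64}}()
Optim.after_while!{T}(d, state::Optim.BFGSState{T}, method::BFGS, options) =
    (last_invH[] = copy(state.invH); nothing)

optimize(rosenbrock, zeros(2), BFGS())
last_invH[]    # BFGS approximation of the inverse Hessian
-last_invH[]   # negative inverse Hessian, as asked in the question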
As @DSM noted in his answer, it is natural to want access to the last state of the optimizer. If tracing is not the answer, maybe this is.
The current least hacky way to do this in Optim.jl is to do the following.
First, load Optim and OptimTestProblems (to have an example to work off of):
julia> using Optim, OptimTestProblems
julia> prob = OptimTestProblems.UnconstrainedProblems.examples["Himmelblau"]
OptimTestProblems.MultivariateProblems.OptimizationProblem{Void,Void,Float64,String,Void}("Himmelblau", OptimTestProblems.MultivariateProblems.UnconstrainedProblems.himmelblau, OptimTestProblems.MultivariateProblems.UnconstrainedProblems.himmelblau_gradient!, nothing, OptimTestProblems.MultivariateProblems.UnconstrainedProblems.himmelblau_hessian!, nothing, [2.0, 2.0], [3.0, 2.0], 0.0, true, true, nothing)
Then specify all the parts optimize needs and enter them in the correct order:
julia> x0 = prob.initial_x
2-element Array{Float64,1}:
2.0
2.0
julia> obj = OnceDifferentiable(prob.f, prob.g!, x0)
NLSolversBase.OnceDifferentiable{Float64,Array{Float64,1},Array{Float64,1},Val{false}}(OptimTestProblems.MultivariateProblems.UnconstrainedProblems.himmelblau, OptimTestProblems.MultivariateProblems.UnconstrainedProblems.himmelblau_gradient!, NLSolversBase.fg!, 0.0, [NaN, NaN], [NaN, NaN], [NaN, NaN], [0], [0])
julia> m = BFGS()
Optim.BFGS{LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64},Optim.##67#69}(LineSearches.InitialStatic{Float64}
alpha: Float64 1.0
scaled: Bool false
, LineSearches.HagerZhang{Float64}
delta: Float64 0.1
sigma: Float64 0.9
alphamax: Float64 Inf
rho: Float64 5.0
epsilon: Float64 1.0e-6
gamma: Float64 0.66
linesearchmax: Int64 50
psi3: Float64 0.1
display: Int64 0
, Optim.#67, Optim.Flat())
julia> options = Optim.Options()
Optim.Options{Float64,Void}(1.0e-32, 1.0e-32, 1.0e-8, 0, 0, 0, false, 0, 1000, false, false, false, 1, nothing, NaN)
julia> bfgsstate = Optim.initial_state(m, options, obj, x0)
Optim.BFGSState{Array{Float64,1},Array{Float64,2},Float64,Array{Float64,1}}([2.0, 2.0], [6.91751e-310, 6.9175e-310], [-42.0, -18.0], NaN, [6.9175e-310, 0.0], [6.91751e-310, 0.0], [6.91751e-310, 0.0], [1.0 0.0; 0.0 1.0], [6.91751e-310, 0.0], NaN, [6.9175e-310, 6.91751e-310], 1.0, false, LineSearches.LineSearchResults{Float64}(Float64[], Float64[], Float64[], 0))
julia> res = optimize(obj, x0, m, options, bfgsstate)
Results of Optimization Algorithm
* Algorithm: BFGS
* Starting Point: [2.0,2.0]
* Minimizer: [2.9999999999998894,2.000000000000162]
* Minimum: 5.406316e-25
* Iterations: 7
* Convergence: true
* |x - x'| ≤ 1.0e-32: false
|x - x'| = 5.81e-09
* |f(x) - f(x')| ≤ 1.0e-32 |f(x)|: false
|f(x) - f(x')| = 2.93e+09 |f(x)|
* |g(x)| ≤ 1.0e-08: true
|g(x)| = 4.95e-12
* Stopped by an increasing objective: false
* Reached Maximum Number of Iterations: false
* Objective Calls: 42
* Gradient Calls: 42
Then we can access the inverse Hessian from the state that was mutated in optimize.
julia> bfgsstate.invH
2×2 Array{Float64,2}:
0.0160654 -0.00945561
-0.00945561 0.034967
And compare it to the inverse Hessian obtained by calculating the inverse of the actual Hessian.
julia> H=similar(bfgsstate.invH)
2×2 Array{Float64,2}:
6.91751e-310 6.91751e-310
6.91751e-310 6.91751e-310
julia> prob.h!(H, Optim.minimizer(res))
34.00000000000832
julia> H
2×2 Array{Float64,2}:
74.0 20.0
20.0 34.0
julia> inv(H)
2×2 Array{Float64,2}:
0.0160681 -0.0094518
-0.0094518 0.0349716
This is similar to the one obtained at the last step of the BFGS run.
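If your own objective has no analytic Hessian like prob.h!, a further option (my sketch, not part of this answer; it assumes the objective accepts a plain Vector, as prob.f does) is to differentiate it at the minimizer with ForwardDiff.jl and negate the inverse, which is exactly the quantity the question asks for:
# Sketch: negative inverse Hessian at the minimizer via automatic differentiation.
using ForwardDiff

xmin = Optim.minimizer(res)
H_ad = ForwardDiff.hessian(prob.f, xmin)   # Hessian of the objective at the minimizer
neg_invH = -inv(H_ad)                      # negative inverse Hessian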