
Optim Julia parameter meaning

Tags: julia, optim

I'm trying to use Optim in Julia to solve a two-variable minimization problem, similar to the following:

using Optim

x = [1.0, 2.0, 3.0]
y = 1.0 .+ 2.0 .* x .+ [-0.3, 0.3, -0.1]

function sqerror(betas, X, Y)
    err = 0.0
    for i in 1:length(X)
        pred_i = betas[1] + betas[2] * X[i]
        err += (Y[i] - pred_i)^2
    end
    return err
end

res = optimize(b -> sqerror(b, x, y), [0.0,0.0])
res.minimizer

I do not quite understand what [0.0,0.0] means. From the documentation at http://julianlsolvers.github.io/Optim.jl/v0.9.3/user/minimization/, my understanding is that it is the initial condition. However, if I change it to [0.0,0.0,0.0], the algorithm still works even though I only have two unknowns, and it returns three minimizer values instead of two. Does anyone know what [0.0,0.0] really stands for?

jgr asked Nov 03 '25 07:11


1 Answer

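It is the initial value, i.e. the starting point of the search. optimize by itself cannot know how many values your sqerror function takes; you specify that through the length of this initial value. Your original sqerror only reads betas[1] and betas[2], so a three-element starting point is accepted and the third element simply has no effect on the objective.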
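
To see why the longer starting vector goes unnoticed, the unchecked objective from the question can be evaluated directly, without Optim. This is a minimal sketch: the name sqerror_unchecked is made up here so that it does not clash with the checked version, and the candidate vectors are arbitrary.

```julia
# Copy of the question's objective under a hypothetical name.
# It only ever reads betas[1] and betas[2].
function sqerror_unchecked(betas, X, Y)
    err = 0.0
    for i in 1:length(X)
        pred_i = betas[1] + betas[2] * X[i]
        err += (Y[i] - pred_i)^2
    end
    return err
end

x = [1.0, 2.0, 3.0]
y = 1.0 .+ 2.0 .* x .+ [-0.3, 0.3, -0.1]

# Two candidates that differ only in the third, unused element.
a = sqerror_unchecked([1.0, 2.0, 0.0], x, y)
b = sqerror_unchecked([1.0, 2.0, 99.0], x, y)

# The third element does not change the objective value.
println(a == b)
```

Because the objective is the same for every value of the third element, an optimizer has nothing to tell it that the extra coordinate is meaningless.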

For example, if you add a dimensionality check to sqerror, you will get a proper error:

julia> function sqerror(betas::AbstractVector, X::AbstractVector, Y::AbstractVector)
           @assert length(betas) == 2
           err = 0.0
           for i in eachindex(X, Y)
               pred_i = betas[1] + betas[2] * X[i]
               err += (Y[i] - pred_i)^2
           end
           return err
       end
sqerror (generic function with 2 methods)

julia> optimize(b -> sqerror(b, x, y), [0.0,0.0,0.0])
ERROR: AssertionError: length(betas) == 2

Note that I also changed the loop range to eachindex(X, Y), so that the function throws an error if the X and Y vectors do not have matching indices.
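
The difference between the two loop ranges can be checked on its own. The sketch below uses two made-up vectors of different lengths; it only illustrates the behaviour of 1:length(X) versus eachindex(X, Y) and is not part of the optimization itself.

```julia
X = [1.0, 2.0]
Y = [3.0, 5.0, 7.0]   # one element more than X

# 1:length(X) looks at X only, so a loop over it would silently skip Y[3].
println(collect(1:length(X)))

# eachindex(X, Y) compares the indices of both arrays and throws
# a DimensionMismatch when they differ.
mismatch_detected = try
    eachindex(X, Y)
    false
catch e
    e isa DimensionMismatch
end
println(mismatch_detected)
```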

Finally, if you want better performance and a lower compilation cost (e.g. because you run this optimization many times), it is better to define the objective function like this:

objective_factory(x, y) = b -> sqerror(b, x, y)
optimize(objective_factory(x, y), [0.0,0.0])
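
One way to see what the factory changes, without calling Optim, is to compare the types of the resulting functions. This is only a sketch: sqerror is replaced by a compact stand-in, and the variable names f1, f2, g1 and g2 are made up for the illustration. Each anonymous function written out in the source gets its own type, while every closure returned by objective_factory for the same argument types shares a single type, so code specialized on it can be compiled once and reused.

```julia
# Compact stand-in for the sqerror objective (assumption: same formula as above).
sqerror(betas, X, Y) = sum(abs2, Y .- (betas[1] .+ betas[2] .* X))

objective_factory(x, y) = b -> sqerror(b, x, y)

x = [1.0, 2.0, 3.0]
y = [2.7, 5.3, 6.9]

# Closures produced by the factory share one type.
f1 = objective_factory(x, y)
f2 = objective_factory(x, y)
println(typeof(f1) == typeof(f2))

# Two separately written anonymous functions have distinct types,
# even though their bodies are identical.
g1 = b -> sqerror(b, x, y)
g2 = b -> sqerror(b, x, y)
println(typeof(g1) == typeof(g2))
```

The factory version also captures x and y as local values instead of referring to global variables from inside the anonymous function.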

Bogumił Kamiński answered Nov 05 '25 14:11


