Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to find minimum of nonlinear, multivariate function using Newton's method (code not linear algebra)

I'm trying to do some parameter estimation and want to choose parameter estimates that minimize the square error in a predicted equation over about 30 variables. If the equation were linear, I would just compute the 30 partial derivatives, set them all to zero, and use a linear-equation solver. But unfortunately the equation is nonlinear and so are its derivatives.

If the equation were over a single variable, I would just use Newton's method (also known as Newton-Raphson). The Web is rich in examples and code to implement Newton's method for functions of a single variable.

Given that I have about 30 variables, how can I program a numeric solution to this problem using Newton's method? I have the equation in closed form and can compute the first and second derivatives, but I don't know quite how to proceed from there. I have found a large number of treatments on the web, but they quickly get into heavy matrix notation. I've found something moderately helpful on Wikipedia, but I'm having trouble translating it into code.

Where I'm worried about breaking down is in the matrix algebra and matrix inversions. I can invert a matrix with a linear-equation solver but I'm worried about getting the right rows and columns, avoiding transposition errors, and so on.

To be quite concrete:

  • I want to work with tables mapping variables to their values. I can write a function of such a table that returns the square error given such a table as argument. I can also create functions that return a partial derivative with respect to any given variable.

  • I have a reasonable starting estimate for the values in the table, so I'm not worried about convergence.

  • I'm not sure how to write the loop that uses an estimate (table of value for each variable), the function, and a table of partial-derivative functions to produce a new estimate.

That last is what I'd like help with. Any direct help or pointers to good sources will be warmly appreciated.


Edit: Since I have the first and second derivatives in closed form, I would like to take advantage of them and avoid more slowly converging methods like simplex searches.

like image 922
Norman Ramsey Avatar asked Dec 24 '08 20:12

Norman Ramsey


People also ask

How can Newton Raphson method be used to solve non linear equations?

This appendix describes the most common method for solving a system of nonlinear equations, namely, the Newton-Raphson method. This is an iterative method that uses initial values for the unknowns and, then, at each iteration, updates these values until no change occurs in two consecutive iterations.

How do you solve a system of equations using Newton's method?

The idea of Newton's method in two variables is the same as in one variable. Just like the one variable case, we choose a starting point, locally linearize, solve the system of linear equations, and repeat. Thus, Newton's method in two variables can be seen as iteration of N.

How do you find non linear functions?

To identify whether a table represents a nonlinear function, compute the differences of every two rows for both x and y values. Then compute the ratios of corresponding differences (y/x). If all the ratios are NOT the same, then the table represents a nonlinear function.

What does the term quasi Newton techniques mean?

The definition of quasi-Newton methods that includes Newton's method as a particular case is adopted. However, especial emphasis is given to the methods that satisfy the secant equation at every iteration, which are called here, as usually, secant methods.


1 Answers

The Numerical Recipes link was most helpful. I wound up symbolically differentiating my error estimate to produce 30 partial derivatives, then used Newton's method to set them all to zero. Here are the highlights of the code:

__doc.findzero = [[function(functions, partials, point, [epsilon, steps]) returns table, boolean
Where
  point       is a table mapping variable names to real numbers 
              (a point in N-dimensional space)
  functions   is a list of functions, each of which takes a table like
              point as an argument
  partials    is a list of tables; partials[i].x is the partial derivative
              of functions[i] with respect to 'x'
  epilson     is a number that says how close to zero we're trying to get
  steps       is max number of steps to take (defaults to infinity)
  result      is a table like 'point', boolean that says 'converged'
]]

-- See Numerical Recipes in C, Section 9.6 [http://www.nrbook.com/a/bookcpdf.php]




function findzero(functions, partials, point, epsilon, steps)
  epsilon = epsilon or 1.0e-6
  steps = steps or 1/0
  assert(#functions > 0)
  assert(table.numpairs(partials[1]) == #functions,
         'number of functions not equal to number of variables')
  local equations = { }
  repeat
    if Linf(functions, point) <= epsilon then
      return point, true
    end
    for i = 1, #functions do
      local F = functions[i](point)
      local zero = F
      for x, partial in pairs(partials[i]) do
        zero = zero + lineq.var(x) * partial(point)
      end
      equations[i] = lineq.eqn(zero, 0)
    end
    local delta = table.map(lineq.tonumber, lineq.solve(equations, {}).answers)
    point = table.map(function(v, x) return v + delta[x] end, point)
    steps = steps - 1
  until steps <= 0
  return point, false
end


function Linf(functions, point)
  -- distance using L-infinity norm
  assert(#functions > 0)
  local max = 0
  for i = 1, #functions do
    local z = functions[i](point)
    max = math.max(max, math.abs(z))
  end
  return max
end
like image 174
Norman Ramsey Avatar answered Oct 20 '22 17:10

Norman Ramsey