Fit plane to a set of points in 3D: scipy.optimize.minimize vs scipy.linalg.lstsq

Given a set of points in 3D, the general problem is to find the a, b, c coefficients of a plane equation in the form:

z = a*x + b*y + c

such that the resulting plane is the best fit possible to that set of points.

  1. In this SO answer, the function scipy.optimize.minimize is used to solve this problem.

    It relies on an initial guess for the coefficients and minimizes an error function that sums the distances from each point to the plane.

  2. In this code (based on this other SO answer) the scipy.linalg.lstsq function is used to tackle the same problem (restricted to a first-order polynomial).

    It solves for C in the equation z = A*C, where each row of A holds the x, y coordinates of one point followed by a constant 1, z is the vector of z coordinates, and C holds the a, b, c coefficients.

    Unlike the method above, this one does not require an initial guess for the plane coefficients (see the sketch after this list).
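
For concreteness, here is a minimal sketch of both approaches on made-up data (the test data, variable names, and the use of squared errors are my own; see the linked answers for the originals):

import numpy as np
import scipy.optimize
import scipy.linalg

# made-up test data: 50 points near the plane z = 1*x + 2*y + 3
rng = np.random.default_rng(42)
x = rng.uniform(-5, 5, 50)
y = rng.uniform(-5, 5, 50)
z = x + 2*y + 3 + rng.normal(scale=0.01, size=50)

# method 1: minimize a summed-error function, starting from a guess
def error(params):
    a, b, c = params
    return np.sum((a*x + b*y + c - z)**2)

res = scipy.optimize.minimize(error, x0=[0.0, 0.0, 0.0])
print(res.x)  # ~ [1. 2. 3.]

# method 2: linear least squares, no initial guess needed
A = np.c_[x, y, np.ones_like(x)]  # each row is (x_i, y_i, 1)
C, *_ = scipy.linalg.lstsq(A, z)
print(C)      # ~ [1. 2. 3.]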

Since the minimize function requires an initial guess, it may or may not converge to the optimal solution, depending on how good that guess is. Does the second method have a similar caveat, or will it always return an exact solution?

asked Jan 28 '16 by Gabriel

2 Answers

Least squares (scipy.linalg.lstsq) is guaranteed to converge. In fact, there is a closed-form analytical solution, given by C = (A^T A)^-1 A^T z (where ^T is the matrix transpose and ^-1 is matrix inversion).
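
As a quick illustration (my own sketch with made-up data; A, z and C as defined in the question), the closed-form solution and lstsq agree up to floating-point error:

import numpy as np
from scipy.linalg import lstsq

# made-up points scattered around the plane z = 2*x - 3*y + 5
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 100)
y = rng.uniform(-1, 1, 100)
z = 2*x - 3*y + 5 + rng.normal(scale=0.01, size=100)

A = np.c_[x, y, np.ones_like(x)]

C_lstsq, *_ = lstsq(A, z)                     # library solver
C_closed = np.linalg.solve(A.T @ A, A.T @ z)  # (A^T A)^-1 A^T z
print(C_lstsq)   # ~ [ 2. -3.  5.]
print(C_closed)  # same, up to rounding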

A general optimization problem, however, is not always solvable: we are not guaranteed to find a minimizing value. But for the given task (find some a, b, c such that z = a*x + b*y + c), the model is linear in the variables we are optimizing over, so the error objective is convex: every local minimum is a global minimum. scipy.optimize.minimize should therefore converge to the optimal value from any reasonable initial guess.

Note: the model stays linear in the unknown coefficients even if we fit z = a*x + b*y + d*x^2 + e*y^2 + f. We don't have to restrict ourselves to linear functions of (x, y), since we can precompute the values (x, y, x^2, y^2) for each point; to the algorithm, these just look like extra columns in the matrix A. So we can actually fit a higher-order polynomial using least squares, as sketched below.
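
A short sketch of that idea (my own example data): the quadratic terms are just extra columns of A, and lstsq recovers their coefficients in exactly the same way:

import numpy as np
from scipy.linalg import lstsq

# made-up points near z = -1*x + 1*x^2 + 2*y^2 + 3
rng = np.random.default_rng(1)
x = rng.uniform(-2, 2, 200)
y = rng.uniform(-2, 2, 200)
z = -x + x**2 + 2*y**2 + 3 + rng.normal(scale=0.05, size=200)

# columns: x, y, x^2, y^2, 1 -- still linear in the coefficients
A = np.c_[x, y, x**2, y**2, np.ones_like(x)]
coeffs, *_ = lstsq(A, z)
print(coeffs)  # ~ [-1.  0.  1.  2.  3.]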

A brief aside: in reality, solvers that don't use an exact analytical solution stop once they are within some acceptable tolerance of the true answer, so we rarely get the exact solution, but it tends to be so close that we accept it as exact in practice. Additionally, even least-squares solvers rarely evaluate the analytical formula directly: explicitly inverting A^T A is numerically fragile, so they rely on stabler factorizations such as QR or the SVD instead.

If you were to change the optimization problem, this would not be true. There are certain classes of problems for which we can generally find an optimal value; the largest of these is the class of convex optimization problems, although there are many non-convex problems for which an optimal value can be found under certain conditions.

If you're interested in learning more, take a look at Convex Optimization by Boyd and Vandenberghe. The first chapter doesn't require much mathematical background, and it gives an overview of the general optimization problem and how it relates to solvable classes such as linear and convex programming.

answered by Alex Alifimoff

I would like to complement the answer with an alternative method for finding the best plane that fits a set of points in R^3. Actually, the lstsq approach works pretty well except in degenerate cases where, for example, the x coordinate of all the points is 0 (or any other constant). In such a case, the columns of the A matrix used in lstsq are not linearly independent. For example:

A = [[ 0   y_0    1]
     [ 0   y_1    1]
     ...
     [ 0   y_k    1] 
     ...
     [ 0   y_N    1]]
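
To make the failure mode concrete (a made-up example of mine): points scattered in the vertical plane x = 2 give a rank-deficient A, and no surface of the form z = a*x + b*y + c can represent that plane:

import numpy as np
from scipy.linalg import lstsq

# points scattered in the vertical plane x = 2
rng = np.random.default_rng(3)
y = rng.uniform(0, 1, 20)
z = rng.uniform(0, 5, 20)
A = np.c_[np.full_like(y, 2.0), y, np.ones_like(y)]

C, res, rank, sv = lstsq(A, z)
print(rank)  # 2, not 3: the x column is just 2 times the ones column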

To circumvent this problem, you can apply the SVD directly to the centered coordinates of the set of points. The SVD is actually used inside lstsq as well, but on a different matrix.

Here is a Python example, given the coordinates of the points in the coords array:

import numpy as np

# coords: (N, 3) array of the x, y, z coordinates of the points

# barycenter (centroid) of the points
G = coords.sum(axis=0) / coords.shape[0]

# SVD of the centered coordinates
u, s, vh = np.linalg.svd(coords - G)

# unit vector normal to the plane
u_norm = vh[2, :]

With this approach, vh is a 3x3 matrix whose rows are orthonormal vectors. The first two rows form an orthonormal basis of the plane, and the third row is a unit vector normal to the plane.

If you really need the a, b, c parameters, you can read them off the normal vector: its components are (a, b, c) in the plane equation a*x + b*y + c*z + d = 0. Since the plane passes through the barycenter G, d = -(a*x_G + b*y_G + c*z_G), as shown below.
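
A minimal sketch of that conversion (assuming the u_norm and G computed in the snippet above), including the z = a'*x + b'*y + c' form from the question when the plane is not vertical:

a, b, c = u_norm
d = -u_norm @ G  # the plane passes through the barycenter G

# back to the question's form z = a'*x + b'*y + c'
# (only possible when c != 0, i.e. the plane is not vertical)
if abs(c) > 1e-12:
    a_p, b_p, c_p = -a / c, -b / c, -d / c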

answered by Ger