The cvx suite for MATLAB can solve the (seemingly innocent) optimization problem below, but it is rather slow for the large, full matrices I'm working with. I'm hoping this is because using cvx is overkill, and that the problem actually has an analytic solution, or that a clever use of some built-in MATLAB functions can more quickly do the job.
Background: It is well-known that both `x1 = A\b` and `x2 = pinv(A)*b` solve the least-squares problem

    minimize norm(A*x - b)

with the distinction that `norm(x2) <= norm(x1)`. In fact, `x2` is the minimum-norm solution to the problem, so `norm(x2) <= norm(x)` for all possible solutions `x`.
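If it helps to see the minimum-norm property concretely, here is a small NumPy sketch of the same facts (this is my own illustration, not part of the question; `np.linalg.pinv` plays the role of MATLAB's `pinv`):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))   # underdetermined: 3 equations, 5 unknowns
b = rng.standard_normal(3)

# x2 = pinv(A) @ b is the minimum-norm solution (MATLAB: x2 = pinv(A)*b)
x2 = np.linalg.pinv(A) @ b

# Any x2 + n with n in the null space of A solves the system equally well,
# but has strictly larger norm.
_, _, Vt = np.linalg.svd(A)
n = Vt[-1]                        # a null-space direction of A
x_other = x2 + 0.7 * n

res2 = np.linalg.norm(A @ x2 - b)
res_other = np.linalg.norm(A @ x_other - b)
print(res2, res_other)            # equal residuals (up to roundoff)
print(np.linalg.norm(x2), np.linalg.norm(x_other))  # x2 has the smaller norm
```

Because the null-space direction `n` is orthogonal to the row space containing `x2`, adding it leaves the residual unchanged while increasing `norm(x)`.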
Defining `D = norm(A*x2 - b)` (equivalently, `D = norm(A*x1 - b)`), `x2` solves the problem

    minimize norm(x)
    subject to norm(A*x - b) == D
Problem: I'd like to find the solution to

    minimize norm(x)
    subject to norm(A*x - b) <= D + threshold

In words, I don't need `norm(A*x - b)` to be as small as possible, just within a certain tolerance: I want the minimum-norm solution `x` that gets `A*x` within `D + threshold` of `b`.
I haven't been able to find an analytic solution to this problem (something like the pseudoinverse in the classic least-squares case), either on the web or by hand. I've been searching for things like "least squares with nonlinear constraint" and "least squares with threshold".
Any insights would be greatly appreciated, but I suppose my real question is: What is the fastest way to solve this "thresholded" least-squares problem in MATLAB?
Interesting question. I do not know the answer to your exact question, but a working solution is presented below.
Define `res(x) := norm(A*x - b)`.
As you state, `x2` minimizes `res(x)`. In the overdetermined case (typically `A` having more rows than columns), `x2` is the unique minimizer. In the underdetermined case, it is joined by infinitely many others*. However, among all of these, `x2` is the unique one that minimizes `norm(x)`.
To summarize, `x2` minimizes (1) `res(x)` and (2) `norm(x)`, and it does so in that order of priority. In fact, this characterizes (fully determines) `x2`.
But another characterization of `x2` is

    x2 := limit_{e-->0} x_e

where

    x_e := argmin_{x} J(x;e)
    J(x;e) := res(x)^2 + e * norm(x)^2

It can be shown that

    x_e = (A'A + e*I)^{-1} A' b        (eqn a)
It should be appreciated that this characterization of `x2` is quite magical: the limit exists even when `(A'A)^{-1}` does not, and the limit somehow preserves priority (2) from above.
Of course, for finite (but small) `e`, `x_e` will not minimize `res(x)` (instead it minimizes `J(x;e)`). In your terminology, the difference is the threshold. I will rename it to

    gap := res(x_e) - min_{x} res(x)
Decreasing the value of `e` is guaranteed to decrease the value of the gap. Reaching a specific gap value (i.e. the threshold) is therefore easy to achieve by tuning `e`.**
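As a sanity check on this claim, here is a NumPy sketch (my own, not part of the answer) of the Tikhonov-regularized solve, showing the gap shrinking monotonically as `e` decreases:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((20, 5))  # overdetermined example
b = rng.standard_normal(20)

x2 = np.linalg.pinv(A) @ b
res_min = np.linalg.norm(A @ x2 - b)

def x_ridge(e):
    # Tikhonov-regularized solution: minimizes norm(A*x-b)^2 + e*norm(x)^2
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + e * np.eye(n), A.T @ b)

gaps = []
for e in [1.0, 0.1, 0.01, 0.001]:
    xe = x_ridge(e)
    gaps.append(np.linalg.norm(A @ xe - b) - res_min)

print(gaps)   # decreasing toward 0 as e shrinks
```

In practice one would tune `e` (e.g. by bisection) until the gap matches the desired threshold.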
This type of modification (adding `norm(x)` to the `res(x)` minimization problem) is known as regularization in the statistics literature (this particular form is Tikhonov regularization, or ridge regression), and is generally considered a good idea for stability, both numerically and with respect to parameter values.
*: Note that `x1` and `x2` only differ in the underdetermined case.

**: It does not even require any heavy computations, because the inverse in (eqn a) is easily computed for any (positive) value of `e` once the SVD of `A` has been computed.
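To make footnote ** concrete, here is a NumPy sketch (again my own illustration) of reusing a single SVD across many values of `e`. With `A = U*S*V'`, the regularized solution reduces to scaling the singular values, so each new `e` costs only a few vector operations:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)

# One SVD up front...
U, s, Vt = np.linalg.svd(A, full_matrices=False)
Utb = U.T @ b

def x_ridge_svd(e):
    # x_e = V * diag(s / (s^2 + e)) * U' * b -- cheap for each new e
    return Vt.T @ ((s / (s**2 + e)) * Utb)

# ...agrees with the direct solve of (A'A + e*I) x = A'b
e = 0.05
x_direct = np.linalg.solve(A.T @ A + e * np.eye(A.shape[1]), A.T @ b)
print(np.allclose(x_ridge_svd(e), x_direct))
```

This is what makes the tuning loop over `e` cheap: the expensive factorization is done once, outside the loop.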