Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Steepest descent to find the solution to a linear system with a Hilbert matrix

I am using the method of steepest descent to figure out the solution to a linear system with a 5x5 Hilbert matrix. I believe the code is fine in the regard that it gives me the right answer.

My problem is that:

  1. I think it is taking too many iterations to get to the right answer. I believe I may have missed something in the algorithm but I'm not sure what at this point.

  2. I am not sure if this is the most effective way to implement the algorithm and additionally, it is a little confusing on which "tol" to pick.

Any insight on these would be appreciated (especially 1.). Thanks!

% Method of Steepest Descent with tol 10^-6
h = hilb(5);                            %Hilbert 5x5 matrix
b = [1;1;1;1;1];                        %solution matrix
solution = zeros(d,1);                  %Initialization 
residual = h*solution - b;
tol = 10^(-6)
count = 0; 

while residual'*residual > tol;
    roe = (residual'*residual)/(residual'*h*residual);
    solution = solution - roe*residual;
    residual = h*solution - b;
    count = count + 1;
end

count 
solution 


%Method of Steepest Descent with tol 10^-12
solution = zeros(d,1);
residual = h*solution - b;
tol = 10^(-12)
count = 0; 

while residual'*residual > tol;
    roe = (residual'*residual)/(residual'*h*residual);
    solution = solution - roe*residual;
    residual = residual - roe*h*residual;
    count = count + 1;
end

count
solution

%another_solution = invhilb(5)*b           %Check for solution
like image 536
DudeWah Avatar asked Oct 18 '22 00:10

DudeWah


2 Answers

It looks like you implemented the algorithm correctly (steepest descent/gradient descent with exact line-search to minimize a convex quadratic function).

Convergence is slow because the problem is ill-conditioned: the Hilbert matrix you consider has a condition number above 400000. Gradient descent is known to be slow when the problem is ill-conditioned.

Considering a well-conditioned problem instead, for example by adding the identity to the Hilbert matrix (h = hilb(5)+eye(5)), the same code terminates after only 7 iterations (and the condition number for that matrix is less than 3).

like image 113
Fanfan Avatar answered Oct 21 '22 00:10

Fanfan


I do not have the knowledge to deal with your problem from a mathematical aspect, but from a programming point of view there is a point I could note.

Indeed you are right. It is taking too many iterations until it gets to the result:

Elapsed time is 6.487824 seconds.

count =

      292945

When you define a step size to approach a final result, the step length has to be optimized. Otherwise you are either too slow in getting close to the answer or you are passing by the optimal answer several times and making a zigzag around it because your step length is too large.

To optimize the step size, I first form a function according to your script (plus some small modifications):

function [solution, count, T] = SDhilb(d, step)
h = hilb(d);
tic
b = ones(d, 1);
solution = zeros(d, 1);
residual = h * solution - b;
res2 = residual' * residual;
tol = 10^(-6);
count = 0;
while res2 > tol;
    roe = res2/(residual' * h * residual);
    solution = solution - step * roe * residual; % here the step size appears
    residual = h * solution - b;
    count = count + 1;
    res2 = residual' * residual; % let's calculate this once per iteration
end
T = toc;

Now using this function for a range of step sizes (0.1:0.1:1.3) and couple of Hilbert matrices (d = 2, 3, 4, 5) it is obvious that 1 is not an efficient step size:

enter image description here

Instead, step = 0.9 seems to be much more efficient. let's see how efficient it is in case of hilb(5):

[result, count, T] = SDhilb(5, 0.9)

result =

    3.1894
  -85.7689
  481.4906
 -894.8742
  519.5830


count =

        1633


T =

    0.0204

Which is more than two orders of magnitude faster.

In a similar way you can try different values of tol to see how dramatical it can influence the speed. In that case there is no optimal value: The smaller the tolerance, the more time you need to wait.

like image 44
erfan Avatar answered Oct 21 '22 01:10

erfan