I am using the method of steepest descent to figure out the solution to a linear system with a 5x5 Hilbert matrix. I believe the code is fine in the regard that it gives me the right answer.
My problem is that:
I think it is taking too many iterations to get to the right answer. I believe I may have missed something in the algorithm but I'm not sure what at this point.
I am not sure if this is the most effective way to implement the algorithm and additionally, it is a little confusing on which "tol" to pick.
Any insight on these would be appreciated (especially 1.). Thanks!
% Method of Steepest Descent with tol 10^-6
h = hilb(5);                            %Hilbert 5x5 matrix
b = [1;1;1;1;1];                        %solution matrix
solution = zeros(d,1);                  %Initialization 
residual = h*solution - b;
tol = 10^(-6)
count = 0; 
while residual'*residual > tol;
    roe = (residual'*residual)/(residual'*h*residual);
    solution = solution - roe*residual;
    residual = h*solution - b;
    count = count + 1;
end
count 
solution 
%Method of Steepest Descent with tol 10^-12
solution = zeros(d,1);
residual = h*solution - b;
tol = 10^(-12)
count = 0; 
while residual'*residual > tol;
    roe = (residual'*residual)/(residual'*h*residual);
    solution = solution - roe*residual;
    residual = residual - roe*h*residual;
    count = count + 1;
end
count
solution
%another_solution = invhilb(5)*b           %Check for solution
It looks like you implemented the algorithm correctly (steepest descent/gradient descent with exact line-search to minimize a convex quadratic function).
Convergence is slow because the problem is ill-conditioned: the Hilbert matrix you consider has a condition number above 400000. Gradient descent is known to be slow when the problem is ill-conditioned.
Considering a well-conditioned problem instead, for example by adding the identity to the Hilbert matrix (h = hilb(5)+eye(5)), the same code terminates after only 7 iterations (and the condition number for that matrix is less than 3).
I do not have the knowledge to deal with your problem from a mathematical aspect, but from a programming point of view there is a point I could note.
Indeed you are right. It is taking too many iterations until it gets to the result:
Elapsed time is 6.487824 seconds.
count =
      292945
When you define a step size to approach a final result, the step length has to be optimized. Otherwise you are either too slow in getting close to the answer or you are passing by the optimal answer several times and making a zigzag around it because your step length is too large.
To optimize the step size, I first form a function according to your script (plus some small modifications):
function [solution, count, T] = SDhilb(d, step)
h = hilb(d);
tic
b = ones(d, 1);
solution = zeros(d, 1);
residual = h * solution - b;
res2 = residual' * residual;
tol = 10^(-6);
count = 0;
while res2 > tol;
    roe = res2/(residual' * h * residual);
    solution = solution - step * roe * residual; % here the step size appears
    residual = h * solution - b;
    count = count + 1;
    res2 = residual' * residual; % let's calculate this once per iteration
end
T = toc;
Now using this function for a range of step sizes (0.1:0.1:1.3) and couple of Hilbert matrices (d = 2, 3, 4, 5) it is obvious that 1 is not an efficient step size:

Instead, step = 0.9 seems to be much more efficient. let's see how efficient it is in case of hilb(5):
[result, count, T] = SDhilb(5, 0.9)
result =
    3.1894
  -85.7689
  481.4906
 -894.8742
  519.5830
count =
        1633
T =
    0.0204
Which is more than two orders of magnitude faster.
In a similar way you can try different values of tol to see how dramatical it can influence the speed. In that case there is no optimal value: The smaller the tolerance, the more time you need to wait.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With