 

Vectorization of a gradient descent code

I am implementing batch gradient descent in MATLAB, and I have a problem with the update step for theta. theta is a vector with two components (two rows). X is a matrix with m rows (the number of training samples) and n=2 columns (the number of features). y is a column vector with m rows.

During the update step, I need to set each theta(i) to

theta(i) = theta(i) - (alpha/m)*sum((X*theta-y).*X(:,i))

This can be done with a for loop, but I can't figure out how to vectorize it (because of the X(:,i) term).
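For reference, the loop version I have in mind looks like this (variable names as above; note that all components of theta should be updated simultaneously, hence the temporary copy):

```matlab
% Loop version: update each theta(i) using the old theta throughout
theta_new = theta;
for i = 1:length(theta)
    theta_new(i) = theta(i) - (alpha/m) * sum((X*theta - y) .* X(:,i));
end
theta = theta_new;
```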

Any suggestion?

asked Dec 22 '13 by bigTree



2 Answers

Looks like you are trying to do a simple matrix multiplication, the thing MATLAB is supposedly best at.

theta = theta - (alpha/m) * (X' * (X*theta-y));
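To see why this is equivalent: element i of X' * (X*theta - y) is the dot product of column X(:,i) with the residual vector, which is exactly sum((X*theta - y).*X(:,i)) from the question. A quick sanity check with made-up dimensions and random data:

```matlab
% Verify that the matrix form matches the per-component sums
m = 5; n = 2; alpha = 0.1;
X = rand(m, n); y = rand(m, 1); theta = rand(n, 1);

grad = X' * (X*theta - y);   % n-by-1 vector; grad(i) = sum((X*theta-y).*X(:,i))
for i = 1:n
    assert(abs(grad(i) - sum((X*theta - y) .* X(:,i))) < 1e-12);
end

theta = theta - (alpha/m) * grad;   % vectorized update
```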
answered Sep 23 '22 by Mad Physicist


In addition to the answer given by Mad Physicist, the following can also be applied.

theta = theta - (alpha/m) * sum((X*theta - y) .* X)';
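One caveat: here (X*theta - y) is m-by-1 while X is m-by-n, so the elementwise product relies on implicit expansion, which MATLAB only supports since R2016b. On older releases the same computation can be written with bsxfun:

```matlab
% Equivalent form for MATLAB releases before R2016b, where the m-by-1
% by m-by-n elementwise product needs explicit broadcasting via bsxfun
theta = theta - (alpha/m) * sum(bsxfun(@times, X*theta - y, X))';
```

Either way, sum(...) here sums down each column, producing a 1-by-n row vector, and the trailing transpose turns it back into the n-by-1 shape of theta.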

answered Sep 22 '22 by Rishu