 

Vectorizing a gradient descent algorithm


I am coding gradient descent in matlab. For two features, I get for the update step:

    temp0 = theta(1,1) - (alpha/m) * sum((X*theta - y) .* X(:,1));
    temp1 = theta(2,1) - (alpha/m) * sum((X*theta - y) .* X(:,2));
    theta(1,1) = temp0;
    theta(2,1) = temp1;

However, I want to vectorize this code so that it can be applied to any number of features. It appears that the update I am trying to compute is a matrix multiplication:

    theta = theta - (alpha/m) * (X' * (X*theta - y));

That makes sense, but when I tried it, I worried that it would not work for gradient descent because the parameters would not be updated simultaneously.

So how can I vectorize this code while making sure the parameters are updated at the same time?
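As a sanity check, here is a minimal NumPy translation of the two MATLAB snippets above (with made-up data), showing that the vectorized update gives exactly the same result as the temp-variable version: the residual is computed once from the old theta, so the update is already simultaneous.

```python
import numpy as np

# Made-up example data: m examples, n = 2 features.
rng = np.random.default_rng(0)
m, n = 5, 2
X = rng.standard_normal((m, n))
y = rng.standard_normal((m, 1))
theta = np.zeros((n, 1))
alpha = 0.1

# Element-wise update with temporaries (simultaneous by construction).
err = X @ theta - y
temp0 = theta[0, 0] - (alpha / m) * np.sum(err[:, 0] * X[:, 0])
temp1 = theta[1, 0] - (alpha / m) * np.sum(err[:, 0] * X[:, 1])
theta_loop = np.array([[temp0], [temp1]])

# Vectorized update: err uses only the old theta, so every
# parameter is updated from the same snapshot.
theta_vec = theta - (alpha / m) * (X.T @ (X @ theta - y))

print(np.allclose(theta_loop, theta_vec))  # True
```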

bigTree asked Dec 23 '13


2 Answers

For the vectorized version, try the following (two steps, to make the simultaneous update explicit):

    gradient = (alpha/m) * X' * (X*theta - y);
    theta = theta - gradient;
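For illustration, a short NumPy sketch of this two-step update run for a few iterations on made-up linear data (the learning rate and iteration count are arbitrary choices), checking that the squared-error cost decreases:

```python
import numpy as np

# Made-up data: y is roughly 3 + 2*x plus noise, with an intercept column in X.
rng = np.random.default_rng(1)
m = 50
X = np.column_stack([np.ones(m), rng.uniform(0, 10, m)])
y = (3 + 2 * X[:, 1] + rng.standard_normal(m)).reshape(-1, 1)
theta = np.zeros((2, 1))
alpha = 0.01

def cost(theta):
    # Mean squared error (halved), the usual gradient-descent objective.
    return float(np.sum((X @ theta - y) ** 2) / (2 * m))

c0 = cost(theta)
for _ in range(100):
    gradient = (alpha / m) * (X.T @ (X @ theta - y))  # step 1
    theta = theta - gradient                          # step 2

print(cost(theta) < c0)  # True
```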
S.Arora answered Oct 26 '22


Your vectorization is correct. I also tried both versions of your code, and they gave me the same theta. Just remember not to use your updated theta in your second implementation.

The following also works, but is less compact than your second implementation:

    Error = X * theta - y;
    for i = 1:2
        S(i) = sum(Error .* X(:,i));
    end
    theta = theta - alpha * (1/m) * S';
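The loop above generalizes directly to any number of features; a NumPy sketch (made-up data, n = 4 features) confirming that the per-feature loop reproduces the vectorized X' * (X*theta - y) update exactly:

```python
import numpy as np

# Made-up data with n = 4 features to show the loop is not limited to two.
rng = np.random.default_rng(2)
m, n = 6, 4
X = rng.standard_normal((m, n))
y = rng.standard_normal((m, 1))
theta = rng.standard_normal((n, 1))
alpha = 0.05

error = X @ theta - y
# Per-feature partial sums, as in the MATLAB loop.
S = np.array([np.sum(error[:, 0] * X[:, i]) for i in range(n)]).reshape(-1, 1)
theta_loop = theta - alpha * (1 / m) * S

# Vectorized equivalent.
theta_vec = theta - (alpha / m) * (X.T @ error)

print(np.allclose(theta_loop, theta_vec))  # True
```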
lennon310 answered Oct 26 '22