gradient descent using python and numpy

Tags:

def gradient(X_norm,y,theta,alpha,m,n,num_it):     temp=np.array(np.zeros_like(theta,float))     for i in range(0,num_it):         h=np.dot(X_norm,theta)         #temp[j]=theta[j]-(alpha/m)*(  np.sum( (h-y)*X_norm[:,j][np.newaxis,:] )  )         temp[0]=theta[0]-(alpha/m)*(np.sum(h-y))         temp[1]=theta[1]-(alpha/m)*(np.sum((h-y)*X_norm[:,1]))         theta=temp     return theta    X_norm,mean,std=featureScale(X) #length of X (number of rows) m=len(X) X_norm=np.array([np.ones(m),X_norm]) n,m=np.shape(X_norm) num_it=1500 alpha=0.01 theta=np.zeros(n,float)[:,np.newaxis] X_norm=X_norm.transpose() theta=gradient(X_norm,y,theta,alpha,m,n,num_it) print theta

My theta from the above code is 100.2 100.2, but it should be 100.2 61.09 in matlab which is correct.

207

asked Jul 22 '13 09:07

2 Answers

Below you can find my implementation of gradient descent for linear regression problem.

At first, you calculate gradient like X.T * (X * w - y) / N and update your current theta with this gradient simultaneously.

X: feature matrix
y: target values
w: weights/values
N: size of training set

Here is the python code:

import pandas as pd import numpy as np from matplotlib import pyplot as plt import random  def generateSample(N, variance=100):     X = np.matrix(range(N)).T + 1     Y = np.matrix([random.random() * variance + i * 10 + 900 for i in range(len(X))]).T     return X, Y  def fitModel_gradient(x, y):     N = len(x)     w = np.zeros((x.shape[1], 1))     eta = 0.0001      maxIteration = 100000     for i in range(maxIteration):         error = x * w - y         gradient = x.T * error / N         w = w - eta * gradient     return w  def plotModel(x, y, w):     plt.plot(x[:,1], y, "x")     plt.plot(x[:,1], x * w, "r-")     plt.show()  def test(N, variance, modelFunction):     X, Y = generateSample(N, variance)     X = np.hstack([np.matrix(np.ones(len(X))).T, X])     w = modelFunction(X, Y)     plotModel(X, Y, w)   test(50, 600, fitModel_gradient) test(50, 1000, fitModel_gradient) test(100, 200, fitModel_gradient)

test1 test2 test2

answered Sep 17 '22 16:09

Muatik

I think your code is a bit too complicated and it needs more structure, because otherwise you'll be lost in all equations and operations. In the end this regression boils down to four operations:

Calculate the hypothesis h = X * theta
Calculate the loss = h - y and maybe the squared cost (loss^2)/2m
Calculate the gradient = X' * loss / m
Update the parameters theta = theta - alpha * gradient

In your case, I guess you have confused m with n. Here m denotes the number of examples in your training set, not the number of features.

Let's have a look at my variation of your code:

import numpy as np import random  # m denotes the number of examples here, not the number of features def gradientDescent(x, y, theta, alpha, m, numIterations):     xTrans = x.transpose()     for i in range(0, numIterations):         hypothesis = np.dot(x, theta)         loss = hypothesis - y         # avg cost per example (the 2 in 2*m doesn't really matter here.         # But to be consistent with the gradient, I include it)         cost = np.sum(loss ** 2) / (2 * m)         print("Iteration %d | Cost: %f" % (i, cost))         # avg gradient per example         gradient = np.dot(xTrans, loss) / m         # update         theta = theta - alpha * gradient     return theta   def genData(numPoints, bias, variance):     x = np.zeros(shape=(numPoints, 2))     y = np.zeros(shape=numPoints)     # basically a straight line     for i in range(0, numPoints):         # bias feature         x[i][0] = 1         x[i][1] = i         # our target variable         y[i] = (i + bias) + random.uniform(0, 1) * variance     return x, y  # gen 100 points with a bias of 25 and 10 variance as a bit of noise x, y = genData(100, 25, 10) m, n = np.shape(x) numIterations= 100000 alpha = 0.0005 theta = np.ones(n) theta = gradientDescent(x, y, theta, alpha, m, numIterations) print(theta)

At first I create a small random dataset which should look like this:

Linear Regression

As you can see I also added the generated regression line and formula that was calculated by excel.

You need to take care about the intuition of the regression using gradient descent. As you do a complete batch pass over your data X, you need to reduce the m-losses of every example to a single weight update. In this case, this is the average of the sum over the gradients, thus the division by m.

The next thing you need to take care about is to track the convergence and adjust the learning rate. For that matter you should always track your cost every iteration, maybe even plot it.

If you run my example, the theta returned will look like this:

Iteration 99997 | Cost: 47883.706462 Iteration 99998 | Cost: 47883.706462 Iteration 99999 | Cost: 47883.706462 [ 29.25567368   1.01108458]

Which is actually quite close to the equation that was calculated by excel (y = x + 30). Note that as we passed the bias into the first column, the first theta value denotes the bias weight.

135

answered Sep 17 '22 16:09

Thomas Jungblut

Related questions
                            
                                Nested list comprehension with two lists
                            
                                export notebook to pdf without code [duplicate]
                            
                                Reading two text files line by line simultaneously
                            
                                How to convert column with string type to int form in pyspark data frame?
                            
                                Identify the changed fields in django post_save signal
                            
                                ValueError: Unknown label type: 'unknown'
                            
                                asyncio.run() cannot be called from a running event loop
                            
                                Using Python and BeautifulSoup (saved webpage source codes into a local file)
                            
                                Multivariate normal density in Python?
                            
                                How to get the PYTHONPATH in shell?
                            
                                How to make file creation an atomic operation?
                            
                                Using python Requests with javascript pages
                            
                                Matplotlib xticks not lining up with histogram
                            
                                Find the number of occurrences of a subsequence in a string
                            
                                "synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'." problem in TensorFlow
                            
                                Good or bad practice in Python: import in the middle of a file [duplicate]
                            
                                Atomic increment of a counter in django
                            
                                Modifying list while iterating [duplicate]
                            
                                Check if a string in a Pandas DataFrame column is in a list of strings
                            
                                Python how to exit main function [duplicate]

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

gradient descent using python and numpy

Tags:

python

machine-learning

numpy

linear-regression

gradient-descent

Madan Ram

People also ask

2 Answers

Muatik

Thomas Jungblut

Recent Activity

Donate For Us