I am trying to implement this algorithm to find the intercept and slope for a single variable:
Here is my Python code to update the intercept and slope, but it is not converging: the RSS increases with each iteration rather than decreasing, and after some iterations it becomes infinite. I cannot find any error in my implementation of the algorithm. How can I solve this problem? I have attached the CSV file too. Here is the code.
import pandas as pd
import numpy as np

# Defining gradient_descent
# This function takes the X values, the Y values and a vector of w0 (intercept), w1 (slope)
# INPUT FEATURE = X (sq. feet of house size)
# TARGET VALUE = Y (price of house)
# W = np.array([w0, w1]).reshape(2, 1)
# W = [w0,
#      w1]
def gradient_descent(X, Y, W):
    intercept = W[0][0]
    slope = W[1][0]
    # Here I will get a list like this:
    # gd = [sum(y - (intercept + slope * x)),
    #       sum((y - (intercept + slope * x)) * x)]
    gd = [sum(y - (intercept + slope * x) for x, y in zip(X, Y)),
          sum(((y - (intercept + slope * x)) * x) for x, y in zip(X, Y))]
    return np.array(gd).reshape(2, 1)

# Defining the residual sum of squares
def RSS(X, Y, W):
    return sum((y - (W[0][0] + W[1][0] * x)) ** 2 for x, y in zip(X, Y))

# Reading the training data
training_data = pd.read_csv("kc_house_train_data.csv")

# Defining fixed parameters
# Learning rate
n = 0.0001
iteration = 1500
# Intercept
w0 = 0
# Slope
w1 = 0
# Creating a (2, 1) vector of the w0, w1 parameters
W = np.array([w0, w1]).reshape(2, 1)

# Running gradient descent
for i in range(iteration):
    W = W + (2 * n) * gradient_descent(training_data["sqft_living"], training_data["price"], W)
    print(RSS(training_data["sqft_living"], training_data["price"], W))
Here is the CSV file.
Simple linear regression is an approach for predicting a response using a single feature. It is assumed that the two variables are linearly related, so we try to find a linear function that predicts the response value (y) as accurately as possible as a function of the feature or independent variable (x).
Suppose we had fitted the equation Weight = 80 + 2 × Height. We could then use it to predict weight if we knew an individual's height. In this example, if an individual were 70 inches tall, we would predict his weight to be: Weight = 80 + 2 × 70 = 220 lbs. In this simple linear regression, we are examining the impact of one independent variable on the outcome.
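As a quick illustration (using the made-up coefficients above, intercept 80 and slope 2), the prediction is just that linear function evaluated at a given height:

def predict_weight(height_inches):
    # hypothetical coefficients from the example above: intercept = 80, slope = 2
    return 80 + 2 * height_inches

print(predict_weight(70))  # 220, matching the 220 lbs above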
Firstly, I find that when writing machine learning code it's best NOT to use complex list comprehensions: anything you can iterate over is easier to read when written with plain loops and indentation, and/or it can often be done with numpy vectorization instead.
Using proper variable names also helps you understand the code better. Using Xs, Ys and Ws as shorthand is nice only if you're good at math; personally, I don't use them in code, especially when writing in Python. As import this says: explicit is better than implicit.
My rule of thumb is to remember that if I write code I can't read 1 week later, it's bad code.
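To illustrate the vectorization point, here is a small sketch with made-up numbers (the variable names are mine, not from your code): the same predictions can be computed with an explicit loop or with a single numpy dot product, and the vectorized form is both shorter and faster:

import numpy as np

heights = np.array([60.0, 65.0, 70.0])
params = np.array([80.0, 2.0])  # [intercept, slope], illustrative values

# Loop version: readable, but slow for large arrays
predictions_loop = []
for h in heights:
    predictions_loop.append(params[0] + params[1] * h)

# Vectorized version: one dot product against a column of ones and the feature
feature_matrix = np.column_stack((np.ones(len(heights)), heights))
predictions_vec = np.dot(feature_matrix, params)

print(predictions_loop)  # [200.0, 210.0, 220.0]
print(predictions_vec)   # [200. 210. 220.]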
First, let's decide what the input parameters for gradient descent are. You will need:
- feature_matrix (the X matrix, type: numpy.array, a matrix of size N * D, where N is the no. of rows/datapoints and D is the no. of columns/features)
- output (the Y vector, type: numpy.array, a vector of size N)
- initial_weights (type: numpy.array, a vector of size D)

Additionally, to check for convergence you will need:

- step_size (the magnitude of each weight update; type: float, usually a small number)
- tolerance (the threshold on the gradient magnitude below which you stop iterating; type: float, usually a small number but much bigger than the step size)

Now to the code.
def regression_gradient_descent(feature_matrix, output, initial_weights, step_size, tolerance):
    converged = False  # set a boolean to check for convergence
    weights = np.array(initial_weights)  # make sure it's a numpy array
    while not converged:
        # Compute the predictions based on feature_matrix and weights.
        # Iterate through the rows and find the single scalar predicted
        # value for each weight * column.
        # Hint: a dot product can solve this easily.
        predictions = [??? for row in feature_matrix]
        # Compute the errors as predictions - output.
        errors = predictions - output
        gradient_sum_squares = 0  # initialize the gradient sum of squares
        # While we haven't reached the tolerance yet, update each feature's weight.
        for i in range(len(weights)):  # loop over each weight
            # Recall that feature_matrix[:, i] is the feature column associated with weights[i].
            # Compute the derivative for weights[i]:
            # Hint: the derivative is 2 * dot product of the feature column and errors.
            derivative = 2 * ????
            # Add the squared value of the derivative to the gradient magnitude
            # (for assessing convergence).
            gradient_sum_squares += (derivative * derivative)
            # Subtract the step size times the derivative from the current weight.
            weights[i] -= (step_size * derivative)
        # Compute the square root of the gradient sum of squares to get the gradient magnitude:
        gradient_magnitude = ???
        # Then check whether the magnitude is lower than the tolerance.
        if ???:
            converged = True
    # Once the while loop breaks, return the weights.
    return weights
I hope the extended pseudo-code helps you better understand gradient descent. I won't fill in the ??? so as not to spoil your homework.
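For what it's worth, once the blanks are filled in, a call for your single-feature case could look roughly like the sketch below. The column of ones models the intercept, and the step_size and tolerance values are purely illustrative guesses. Note that because sqft_living values are in the thousands, the step size has to be tiny, which is very likely why the n = 0.0001 in your code diverges:

import numpy as np
import pandas as pd

training_data = pd.read_csv("kc_house_train_data.csv")

# N x 2 matrix: a column of ones (for the intercept) plus the sqft_living feature
feature_matrix = np.column_stack((np.ones(len(training_data)),
                                  training_data["sqft_living"].values))
output = training_data["price"].values

initial_weights = np.array([0.0, 0.0])  # [intercept, slope]
step_size = 7e-12   # illustrative guess; must be tiny because sqft values are large
tolerance = 2.5e7   # illustrative guess

weights = regression_gradient_descent(feature_matrix, output,
                                      initial_weights, step_size, tolerance)
print(weights)  # [intercept, slope]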
Note that your RSS code is also unreadable and unmaintainable. It's easier to just do:
>>> import numpy as np
>>> prediction = np.array([1,2,3])
>>> output = np.array([1,1,5])
>>> residual = output - prediction
>>> RSS = sum(residual * residual)
>>> RSS
5
Going through the numpy basics will go a long way in machine learning and matrix-vector manipulation, without going nuts with iterations: http://docs.scipy.org/doc/numpy-1.10.1/user/basics.html