I implemented, for training purposes, a linear regression in Python. The problem is that the cost is increasing instead of decreasing. For the data I use the Airfoil Self-Noise Data Set; the data can be found here.
I import the data as follows:
import pandas as pd

def features():
    features = pd.read_csv("data/airfoil_self_noise/airfoil_self_noise.dat.txt", sep="\t", header=None)
    X = features.iloc[:, 0:5]
    Y = features.iloc[:, 5]
    return X.values, Y.values.reshape(Y.shape[0], 1)
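As a quick sanity check (assuming the file matches the UCI version of the data set, which has 1503 rows with 5 input columns and 1 target column), the shapes should come out as:

X, Y = features()
print(X.shape)  # expected: (1503, 5)
print(Y.shape)  # expected: (1503, 1)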
My code for the linear regression is the following:
import numpy as np
import random

class linearRegression():
    def __init__(self, learning_rate=0.01, max_iter=20):
        """
        Initialize the hyperparameters of the linear regression.
        :param learning_rate: the learning rate
        :param max_iter: the max number of iterations to perform
        """
        self.lr = learning_rate
        self.max_iter = max_iter
        self.m = None
        self.weights = None
        self.bias = None

    def fit(self, X, Y):
        """
        Run the gradient descent algorithm.
        :param X: the inputs
        :param Y: the outputs
        """
        self.m = X.shape[0]
        self.weights = np.random.normal(0, 0.1, (X.shape[1], 1))
        self.bias = random.normalvariate(0, 0.1)
        for iteration in range(self.max_iter):
            A = self.__forward(X)
            dw, db = self.__backward(A, X, Y)
            J = (1 / (2 * self.m)) * np.sum(np.power((A - Y), 2))
            print("at iteration %s cost is %s" % (iteration, J))
            self.weights = self.weights - self.lr * dw
            self.bias = self.bias - self.lr * db

    def predict(self, X):
        """
        Make predictions on the inputs.
        :param X: the inputs
        :return: the predicted outputs
        """
        Y_pred = self.__forward(X)
        return Y_pred

    def __forward(self, X):
        """
        Compute the linear function on the inputs.
        :param X: the inputs
        :return:
            A: the activation
        """
        A = np.dot(X, self.weights) + self.bias
        return A

    def __backward(self, A, X, Y):
        """
        Compute the gradients of the cost with respect to the parameters.
        :param A: the activation
        :param X: the inputs
        :param Y: the outputs
        :return:
            dw: the gradient for the weights
            db: the gradient for the bias
        """
        dw = (1 / self.m) * np.dot(X.T, (A - Y))
        db = (1 / self.m) * np.sum(A - Y)
        return dw, db
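For completeness, I believe the gradients in __backward match the partial derivatives of the cost J computed in fit, so the gradient code itself should be correct:

$$\frac{\partial J}{\partial w} = \frac{1}{m} X^\top (A - Y), \qquad \frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} (A_i - Y_i)$$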
Then I instantiate the linearRegression class as follows:
from sklearn.model_selection import train_test_split

X, Y = features()
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.33, random_state=42)

model = linearRegression()
model.fit(X_train, y_train)
I have tried to find out why the cost is increasing, but so far I have not been able to. If someone could point me in the right direction, it would be appreciated.
For a linear regression model, the cost is computed from the errors between the predicted values and the actual values, and training searches for the parameters that minimize it. The usual choice of cost function for linear regression is the mean squared error (MSE) or its square root, the root mean squared error (RMSE).
A useful distinction: the loss function is the error for an individual data point (one training example), while the cost function is the average of the losses over the n samples in the training data.
Mean squared error is the average of the squared differences between the predictions and the true values, and its output is a single number representing the cost. So the line with the minimum cost, i.e. the minimum MSE, represents the relationship between X and Y in the best possible manner.
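Concretely, in the notation of the code above, the cost minimized by gradient descent is the following (the extra factor of 1/2 is a common convention that cancels when taking the derivative):

$$J(w, b) = \frac{1}{2m} \sum_{i=1}^{m} \left( \hat{y}^{(i)} - y^{(i)} \right)^2$$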
In the machine learning world, linear regression is a parametric regression model that makes a prediction by taking a weighted sum of the input features of an observation (or data point) and adding a constant called the bias term.
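As a minimal illustration (the numbers are made up, not taken from the data set), the prediction for a single observation is just a dot product plus the bias:

import numpy as np

x = np.array([3.0, 1.0, 2.0])   # one observation with 3 features (made up)
w = np.array([0.5, -1.0, 2.0])  # learned weights (made up)
b = 0.25                        # learned bias (made up)

y_hat = np.dot(x, w) + b        # weighted sum of the features plus the bias
print(y_hat)                    # 3*0.5 - 1*1.0 + 2*2.0 + 0.25 = 4.75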
Normally, if you choose too large a learning rate you get exactly this kind of problem: each gradient step overshoots the minimum and the cost diverges. I examined your code, and my main observation is:
Your learning rate is much too high for this data. The features of the airfoil data set are on very different scales (the first column, frequency in Hz, reaches values in the thousands), which makes the gradients large. When I run your code unmodified except for a learning rate of 1e-7 instead of 0.01, I get reliably decreasing costs.
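Here is a minimal sketch of the suggested fix, reusing the features() loader and the linearRegression class from the question (the file path and train/test split mirror the question; adjust them to your setup):

from sklearn.model_selection import train_test_split

X, Y = features()
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.33, random_state=42)

# Much smaller step size: with unscaled features, 0.01 makes every
# update overshoot, so the cost grows instead of shrinking.
model = linearRegression(learning_rate=1e-7, max_iter=20)
model.fit(X_train, y_train)  # the printed cost should now decrease

An alternative worth trying (not tested here) is to standardize the features first, for example with sklearn.preprocessing.StandardScaler, after which a larger learning rate such as 0.01 typically converges.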