In sklearn.linear_model.LinearRegression, there is a parameter that can be set to fit_intercept=True or fit_intercept=False. I am wondering: if we set it to True, does it add an additional intercept column of all 1's to the dataset? If I already have a dataset with a column of 1's, does fit_intercept=False account for that, or does it force a zero-intercept model?
Update: It seems people do not get my question. The question is: what if I already had a column of 1's in my dataset of predictors (the 1's are for the intercept)? Then:
If I use fit_intercept=False, will it remove the column of 1's?
If I use fit_intercept=True, will it add an EXTRA column of 1's?
LinearRegression fits a linear model with coefficients w = (w1, …, wp) to minimize the residual sum of squares between the observed targets in the dataset and the targets predicted by the linear approximation. The fit_intercept parameter controls whether an intercept is calculated for this model.
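To make the role of the intercept concrete, the objective described above can be written out as follows. This is only a sketch in standard least-squares notation: the symbols b (intercept), n (number of samples) and x_{ij} (value of feature j in sample i) are my own labels, not names taken from the scikit-learn documentation.

```latex
% fit_intercept=True: the intercept b is estimated together with the weights
\min_{w,\, b} \; \sum_{i=1}^{n} \Bigl( y_i - b - \sum_{j=1}^{p} w_j x_{ij} \Bigr)^2

% fit_intercept=False: the same objective with b fixed at 0
\min_{w} \; \sum_{i=1}^{n} \Bigl( y_i - \sum_{j=1}^{p} w_j x_{ij} \Bigr)^2
```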
fit_intercept=False sets the y-intercept to 0. If fit_intercept=True, the y-intercept is estimated from the data along with the slope of the line of best fit.
    from sklearn.linear_model import LinearRegression
    import numpy as np
    import matplotlib.pyplot as plt

    bias = 100

    # Noisy points around the line y = 0.3 * x + bias
    X = np.arange(1000).reshape(-1, 1)
    y_true = np.ravel(X.dot(0.3) + bias)
    noise = np.random.normal(0, 60, 1000)
    y = y_true + noise

    lr_fi_true = LinearRegression(fit_intercept=True)
    lr_fi_false = LinearRegression(fit_intercept=False)

    lr_fi_true.fit(X, y)
    lr_fi_false.fit(X, y)

    print('Intercept when fit_intercept=True : {:.5f}'.format(lr_fi_true.intercept_))
    print('Intercept when fit_intercept=False : {:.5f}'.format(lr_fi_false.intercept_))

    # Predictions computed by hand from the fitted coefficients and intercepts
    lr_fi_true_yhat = np.dot(X, lr_fi_true.coef_) + lr_fi_true.intercept_
    lr_fi_false_yhat = np.dot(X, lr_fi_false.coef_) + lr_fi_false.intercept_

    plt.scatter(X, y, label='Actual points')
    plt.plot(X, lr_fi_true_yhat, 'r--', label='fit_intercept=True')
    plt.plot(X, lr_fi_false_yhat, 'r-', label='fit_intercept=False')
    plt.legend()

    # Reference lines: the y-axis, the true bias, and y = 0
    plt.vlines(0, 0, y.max())
    plt.hlines(bias, X.min(), X.max())
    plt.hlines(0, X.min(), X.max())

    plt.show()
This example prints something like the following (the noise is not seeded, so the first number varies from run to run):

    Intercept when fit_intercept=True : 100.32210
    Intercept when fit_intercept=False : 0.00000
Visually, it becomes clear what fit_intercept does. When fit_intercept=True, the line of best fit is free to cross the y-axis wherever the data put it (close to 100 in this example). When fit_intercept=False, the line is forced through the origin (0, 0).
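One way to check this numerically, rather than visually, is to compare LinearRegression against a plain least-squares solve. The sketch below is my own illustration, not part of the original answer: the data, the seed, and the variable names (w_no_intercept, X_aug, w_aug) are assumptions chosen for the example. It uses np.linalg.lstsq on X alone for the no-intercept case, and on X with a constant column prepended for the intercept case.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative data (assumed for this sketch): y = 0.3 * x + 100 plus noise
rng = np.random.default_rng(0)
X = np.arange(100, dtype=float).reshape(-1, 1)
y = 0.3 * X.ravel() + 100 + rng.normal(0, 5, size=100)

# fit_intercept=False: least squares on X alone, so the line passes through the origin
w_no_intercept, *_ = np.linalg.lstsq(X, y, rcond=None)
lr_false = LinearRegression(fit_intercept=False).fit(X, y)

# fit_intercept=True: compare with least squares on X plus a constant column
X_aug = np.hstack([np.ones((X.shape[0], 1)), X])
w_aug, *_ = np.linalg.lstsq(X_aug, y, rcond=None)
lr_true = LinearRegression(fit_intercept=True).fit(X, y)

print(np.allclose(lr_false.coef_, w_no_intercept))   # slopes agree, intercept_ is 0.0
print(np.allclose(lr_true.intercept_, w_aug[0]))     # intercept matches the constant-column weight
print(np.allclose(lr_true.coef_, w_aug[1:]))         # slope matches too
```

The comparison only shows that the fitted numbers agree; it does not claim that scikit-learn builds a constant column internally.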
What happens if I include a column of ones or zeros and set fit_intercept to True or False?
The example below shows how to inspect this for a column of ones.
    from sklearn.linear_model import LinearRegression
    import numpy as np

    np.random.seed(1)

    bias = 100

    X = np.arange(1000).reshape(-1, 1)
    y_true = np.ravel(X.dot(0.3) + bias)
    noise = np.random.normal(0, 60, 1000)
    y = y_true + noise

    # with column of ones
    X_with_ones = np.hstack((np.ones((X.shape[0], 1)), X))

    # Fit every combination of fit_intercept and with/without the ones column
    for b, data in ((True, X), (False, X), (True, X_with_ones), (False, X_with_ones)):
        lr = LinearRegression(fit_intercept=b)
        lr.fit(data, y)
        print(lr.intercept_, lr.coef_)
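This prints the following (the comment lines are added here to label each case):

    # fit_intercept=True, no column of ones
    104.156765787 [ 0.29634031]
    # fit_intercept=False, no column of ones
    0.0 [ 0.45265361]
    # fit_intercept=True, column of ones
    104.156765787 [ 0. 0.29634031]
    # fit_intercept=False, column of ones
    0.0 [ 104.15676579 0.29634031]

Take-away: LinearRegression neither adds nor removes columns; coef_ always has one entry per column you pass in. With fit_intercept=True and a column of ones, the intercept is reported in intercept_ and the redundant ones column gets a coefficient of 0. With fit_intercept=False and a column of ones, intercept_ is 0.0 and the coefficient of the ones column (104.15676579) plays the role of the intercept, giving the same fitted line as fit_intercept=True without the extra column. Only fit_intercept=False without a column of ones forces the line through the origin.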
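The same conclusions can be checked programmatically instead of by reading the printed numbers. The following is a sketch on made-up data (the seed, sizes, and variable names such as lr_plain and centered_ones are my own choices). It also illustrates why the ones column plausibly ends up with a weight of 0 under fit_intercept=True: as far as I understand the implementation, the intercept is handled by centering the data, and a centered constant column is all zeros.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative data (assumed for this sketch)
rng = np.random.default_rng(1)
X = np.arange(200, dtype=float).reshape(-1, 1)
y = 0.3 * X.ravel() + 100 + rng.normal(0, 5, size=200)
X_with_ones = np.hstack([np.ones((X.shape[0], 1)), X])

lr_plain = LinearRegression(fit_intercept=True).fit(X, y)
lr_ones_false = LinearRegression(fit_intercept=False).fit(X_with_ones, y)
lr_ones_true = LinearRegression(fit_intercept=True).fit(X_with_ones, y)

# One coefficient per input column in every case: nothing is added or removed
print(lr_plain.coef_.shape, lr_ones_false.coef_.shape, lr_ones_true.coef_.shape)

# Subtracting the mean from a constant column leaves only zeros,
# so that column carries no information once the data are centered
centered_ones = X_with_ones[:, 0] - X_with_ones[:, 0].mean()
print(np.allclose(centered_ones, 0.0))

# All three models describe the same fitted line
print(np.allclose(lr_plain.predict(X), lr_ones_false.predict(X_with_ones)))
print(np.allclose(lr_plain.predict(X), lr_ones_true.predict(X_with_ones)))
```

In practice this means you can either keep your own column of ones and pass fit_intercept=False, or drop it and leave fit_intercept=True; the predictions should be the same either way.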