In scikit-learn's PolynomialFeatures preprocessor, there is an include_bias option. It essentially just adds a column of ones to the feature matrix. I was wondering what the point of this is. Of course, you can set it to False, but in theory, how does having or not having a column of ones alongside the generated polynomial features affect the regression?
This is the explanation in the documentation, but I can't get anything useful out of it about why the option should or should not be used.
include_bias : boolean
If True (default), then include a bias column, the feature in which all polynomial powers are zero (i.e. a column of ones - acts as an intercept term in a linear model).
Suppose you want to perform the following regression:
y ~ a + b x + c x^2
where x is a generic sample. The best coefficients a, b, c are computed via simple matrix algebra. First, let us denote with X = [1 | x | x^2] a matrix with N rows, where N is the number of samples. The first column is a column of 1s, the second column holds the values x_i for all samples i, and the third column holds the values x_i^2 for all samples i. Let us denote with B the column vector B = [a b c]^T. If y is the column vector of the N target values for all samples i, we can write the regression as
y ~ X B
The i-th row of this equation is y_i ~ [1 x_i x_i^2] [a b c]^T = a + b x_i + c x_i^2.
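As a small illustration of this matrix, the sketch below builds it with PolynomialFeatures for three made-up samples (the values 2, 3, 4 are an assumption chosen only for readability) and shows that include_bias controls only the leading column of ones.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Three made-up samples of a single feature x (illustrative values only).
x = np.array([[2.0], [3.0], [4.0]])

# include_bias=True (the default) yields the columns [1, x, x^2].
X_with_bias = PolynomialFeatures(degree=2, include_bias=True).fit_transform(x)

# include_bias=False drops the column of ones and leaves [x, x^2].
X_no_bias = PolynomialFeatures(degree=2, include_bias=False).fit_transform(x)

print(X_with_bias)
print(X_no_bias)
```

The two outputs differ only in the first column, which corresponds to the coefficient a.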
The goal of training a regression is to find B = [a b c]^T such that X B is as close as possible to y (in the least-squares sense, for ordinary linear regression). If you don't add the column of 1s, you are assuming a priori that a = 0, which might not be correct.
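To make the effect of forcing a = 0 concrete, here is a minimal NumPy sketch on synthetic, noise-free data. The true coefficients a = 5, b = 1, c = 2 are assumptions picked for the example, and np.linalg.lstsq stands in for whatever solver a library would use internally.

```python
import numpy as np

# Synthetic, noise-free data with an assumed true model a=5, b=1, c=2.
x = np.linspace(-2.0, 2.0, 50)
y = 5.0 + 1.0 * x + 2.0 * x**2

# Design matrix with the column of ones: [1 | x | x^2].
X_full = np.column_stack([np.ones_like(x), x, x**2])
# Design matrix without it: [x | x^2], which forces a = 0.
X_no_ones = np.column_stack([x, x**2])

# Least-squares solutions for both design matrices.
B_full, *_ = np.linalg.lstsq(X_full, y, rcond=None)
B_no_ones, *_ = np.linalg.lstsq(X_no_ones, y, rcond=None)

# Sum of squared residuals for each fit.
sse_full = np.sum((y - X_full @ B_full) ** 2)
sse_no_ones = np.sum((y - X_no_ones @ B_no_ones) ** 2)

print(B_full)       # approximately [5, 1, 2]
print(B_no_ones)    # two coefficients only; the offset cannot be represented
print(sse_full, sse_no_ones)
```

With the column of ones the fit recovers the assumed coefficients and the residual is essentially zero. Without it, the constant offset has to be absorbed by the x^2 term, and a clearly non-zero residual remains.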
In practice, when you write Python code and use PolynomialFeatures together with sklearn.linear_model.LinearRegression, the latter takes care of the intercept by default (since the fit_intercept parameter of LinearRegression is True by default), so you don't need to add a column of 1s in PolynomialFeatures as well. Therefore, one usually sets include_bias=False in PolynomialFeatures.
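A hedged sketch of the two equivalent setups, again on synthetic data with assumed coefficients a = 5, b = 1, c = 2: either LinearRegression fits the intercept and PolynomialFeatures omits the bias column, or PolynomialFeatures supplies the bias column and LinearRegression is told not to fit an intercept.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Synthetic, noise-free data with an assumed true model a=5, b=1, c=2.
x = np.linspace(-2.0, 2.0, 50).reshape(-1, 1)
y = 5.0 + 1.0 * x.ravel() + 2.0 * x.ravel() ** 2

# Usual setup: no bias column, LinearRegression fits the intercept itself
# (fit_intercept=True is its default).
X_poly = PolynomialFeatures(degree=2, include_bias=False).fit_transform(x)
model = LinearRegression().fit(X_poly, y)
print(model.intercept_, model.coef_)   # approximately 5 and [1, 2]

# Equivalent alternative: keep the bias column and disable fit_intercept,
# so the intercept shows up as the first coefficient.
X_poly_bias = PolynomialFeatures(degree=2, include_bias=True).fit_transform(x)
model_alt = LinearRegression(fit_intercept=False).fit(X_poly_bias, y)
print(model_alt.coef_)                 # approximately [5, 1, 2]
```

Both models make the same predictions; the only difference is where the intercept is reported.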
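The situation is different if you use statsmodels.OLS instead of LinearRegression: OLS does not add an intercept on its own, so there the column of 1s has to be supplied explicitly, for example by keeping include_bias=True in PolynomialFeatures.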