Seems like a basic question, but I need to use feature scaling (take each feature value, subtract the mean then divide by the standard deviation) in my implementation of linear regression with gradient descent. After I'm finished, I'd like the weights and regression line rescaled to the original data. I'm only using one feature, plus the y-intercept term. How would I change the weights, after I get them using the scaled data, so that they apply to the original unscaled data?
Centering/scaling does not affect your statistical inference in regression models — the estimates are adjusted appropriately and the p-values will be the same.
Summary. We need to perform feature scaling when we are dealing with gradient-descent-based algorithms (Linear and Logistic Regression, Neural Networks) and distance-based algorithms (KNN, K-means, SVM), as these are very sensitive to the range of the data points.
If feature scaling is not done, a machine learning algorithm tends to give more weight to features with larger values and treat features with smaller values as less important, regardless of the units those values are expressed in.
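As a minimal sketch of the standardization the question describes (subtract the mean, divide by the standard deviation), assuming NumPy and purely illustrative feature values:

```python
import numpy as np

# Illustrative feature values (not from the original post)
x0 = np.array([80.0, 120.0, 150.0, 210.0, 300.0])

# Standardize: subtract the mean, then divide by the standard deviation
u = x0.mean()
std = x0.std()
x_scaled = (x0 - u) / std

print(x_scaled)         # centered around 0 with unit variance
print(x_scaled.mean())  # approximately 0
print(x_scaled.std())   # approximately 1
```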
For example, to find the best parameter values of a linear regression model, there is a closed-form solution called the Normal Equation. If your implementation uses that equation, there is no stepwise optimization process, so feature scaling is not necessary.
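A short sketch of that closed-form solution, assuming NumPy and the same illustrative data as above (the target values y are made up for demonstration):

```python
import numpy as np

# Illustrative data: one feature plus made-up targets
x0 = np.array([80.0, 120.0, 150.0, 210.0, 300.0])
y = np.array([10.0, 14.0, 17.0, 23.0, 32.0])

# Design matrix with a column of ones for the intercept term
X = np.column_stack([np.ones_like(x0), x0])

# Normal Equation: theta = (X^T X)^(-1) X^T y, solved without inverting explicitly
theta = np.linalg.solve(X.T @ X, X.T @ y)
b, W = theta
print(b, W)  # intercept and slope on the original, unscaled feature
```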
Suppose your regression is y = W*x + b, with x the scaled data. With the original data x0 it is

y = (W/std) * x0 + (b - (u/std) * W)

where u and std are the mean and standard deviation of x0. Yet I don't think you need to transform back the data. Just use the same u and std to scale the new test data.
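A minimal sketch of that conversion, assuming NumPy, a plain mean-squared-error gradient descent loop, and illustrative data (the values, learning rate, and iteration count are hypothetical):

```python
import numpy as np

# Illustrative data: one feature x0 and made-up targets y
x0 = np.array([80.0, 120.0, 150.0, 210.0, 300.0])
y = np.array([10.0, 14.0, 17.0, 23.0, 32.0])

# Scale the feature as in the question: subtract the mean, divide by the std
u, std = x0.mean(), x0.std()
x = (x0 - u) / std

# Gradient descent on the scaled feature
W, b = 0.0, 0.0
lr = 0.1
for _ in range(1000):
    err = W * x + b - y
    W -= lr * (err * x).mean()
    b -= lr * err.mean()

# Convert the weights back to the original feature:
# y = (W/std) * x0 + (b - (u/std) * W)
W_orig = W / std
b_orig = b - (u / std) * W

print(W_orig, b_orig)
# Sanity check: both forms give the same predictions
print(np.allclose(W * x + b, W_orig * x0 + b_orig))
```

The same u and std computed from the training data would also be reused to scale any new test inputs before prediction, as the answer suggests.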