In the Coursera machine learning course (https://share.coursera.org/wiki/index.php/ML:Linear_Regression_with_Multiple_Variables#Gradient_Descent_for_Multiple_Variables), it says gradient descent should converge.
I'm using LinearRegression from scikit-learn. It doesn't provide any gradient descent information. I have seen many questions on Stack Overflow about implementing linear regression with gradient descent.
How do we use linear regression from scikit-learn in the real world? Or, put differently: why doesn't scikit-learn provide gradient descent information in its linear regression output?
A linear regression model can be fitted to its optimum solution using gradient descent. However, sklearn's LinearRegression doesn't use gradient descent.
As for this question, linear regression is just a model (a regression technique), whereas gradient descent is an algorithm that can be used to fit it.
Gradient descent finds the values of m and c (slope and intercept) of the linear regression equation that minimize the error. With these values of m and c, we get the equation of the best-fit line and are ready to make predictions.
Linear regression is the process of determining the straight line that best fits a set of scattered data points; the line can then be extended to predict new data points. Because of its simplicity and useful properties, linear regression is a fundamental machine learning method.
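To make the role of m and c concrete, here is a minimal sketch of batch gradient descent on mean squared error for a single feature. The function name fit_line_gd, the learning rate, the iteration count and the toy data are all assumptions chosen for illustration; none of this is what scikit-learn does internally.

```python
def fit_line_gd(xs, ys, lr=0.05, n_iters=5000):
    """Fit y = m*x + c by batch gradient descent on mean squared error.

    Hypothetical helper for illustration; lr and n_iters are assumed values.
    """
    m, c = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iters):
        # Residuals of the current line at every data point
        errors = [(m * x + c) - y for x, y in zip(xs, ys)]
        # Gradients of MSE = (1/n) * sum(error^2) with respect to m and c
        grad_m = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_c = (2.0 / n) * sum(errors)
        # Step against the gradient
        m -= lr * grad_m
        c -= lr * grad_c
    return m, c


# Toy data lying exactly on y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
m, c = fit_line_gd(xs, ys)
print(m, c)
```

If the learning rate is too large for the scale of the data, the updates diverge instead of converging, which is why the course spends time on checking convergence.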
Scikit-learn provides two approaches to linear regression:
LinearRegression uses an ordinary least squares solver from scipy, since linear regression is one of the few models that have a closed-form solution. Despite what the ML course suggests, you can actually fit this model just by inverting and multiplying some matrices.
SGDRegressor is an implementation of stochastic gradient descent, a very generic one where you can choose the loss and penalty terms. To obtain linear regression, set the loss to squared error (L2) and the penalty to none (plain linear regression) or to L2 (ridge regression).
There is no "typical gradient descent" option because plain gradient descent is rarely used in practice. If you can decompose your loss function into additive terms, the stochastic approach is known to behave better (hence SGD), and if you can spare enough memory, the OLS method is faster and easier (hence the first option).
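For reference, the closed-form OLS route amounts to a couple of matrix operations. The sketch below solves the normal equations with NumPy on a small made-up dataset; it illustrates the idea and is not a copy of what LinearRegression does internally (the variable names and data are assumptions).

```python
import numpy as np

# Small made-up dataset, roughly y = 2x + 1 with some noise
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.0])

# Design matrix with a leading column of ones for the intercept
X = np.column_stack([np.ones_like(x), x])

# Normal equations: solve (X^T X) theta = X^T y
theta = np.linalg.solve(X.T @ X, X.T @ y)
c, m = theta

# Same fit through a least-squares routine, which is numerically safer
theta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(m, c)
```

The whole design matrix has to fit in memory for this to work, which is the trade-off against the stochastic approach.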