Why gradient descent when we can solve linear regression analytically?

What is the benefit of using gradient descent for linear regression? It looks like we can solve the problem (finding theta_0 ... theta_n that minimize the cost function) analytically, so why would we still want to use gradient descent to do the same thing? Thanks.

203

asked Aug 12 '13 16:08

John

2 Answers

When you use the normal equations to minimize the cost function analytically, you have to compute:

theta = (X^T X)^(-1) X^T y

Where X is your matrix of input observations and y is your output vector. The problem with this operation is the time complexity of inverting X^T X, an n x n matrix (n being the number of features), which is O(n^3). As n increases, it can take a very long time to finish.
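As a rough illustration of the analytical route, here is a minimal NumPy sketch. The function name `normal_equation` and the toy data are made up for this example. It solves the linear system (X^T X) theta = X^T y instead of forming the inverse explicitly, which is the usual practice but still costs on the order of n^3 in the number of features.

```python
import numpy as np

def normal_equation(X, y):
    """Least-squares fit via the normal equations.

    X: (m, n) design matrix (include a column of ones for the intercept).
    y: (m,) target vector.
    """
    # Solve (X^T X) theta = X^T y; no explicit inverse is formed,
    # but the solve is still cubic in the number of features n.
    return np.linalg.solve(X.T @ X, X.T @ y)

# Hypothetical toy data: y = 1 + 2*x with no noise.
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])  # first column is the intercept term
y = 1.0 + 2.0 * x

theta = normal_equation(X, y)
print(theta)
```

With only two parameters the solve is instantaneous; the cubic cost only starts to matter when the number of features gets large.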

When n is small (roughly n < 1000, or even n < 10000), the normal equations can be the better option for computing theta. For larger values, however, gradient descent is much faster, so the only reason is time :)
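For comparison, a minimal sketch of batch gradient descent on the same kind of problem might look like the following. The learning rate, iteration count, and toy data are assumptions chosen for this example, not recommended defaults. Each iteration costs about O(m n) for m observations and n features, with no n x n matrix to build or invert.

```python
import numpy as np

def gradient_descent(X, y, lr=0.1, n_iters=5000):
    """Batch gradient descent for the cost J(theta) = ||X theta - y||^2 / (2m)."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(n_iters):
        # Gradient of J with respect to theta.
        grad = X.T @ (X @ theta - y) / m
        theta -= lr * grad
    return theta

# Hypothetical toy data: y = 1 + 2*x with no noise.
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])  # first column is the intercept term
y = 1.0 + 2.0 * x

theta = gradient_descent(X, y)
print(theta)
```

On this tiny problem the result agrees with the closed-form solution up to numerical tolerance. The step size has to be small enough for the iteration to converge, which is one practical cost of the iterative approach.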

answered Oct 08 '22 13:10

jabaldonedo

You should provide more details about your problem. What exactly are you asking about? Are we talking about linear regression in one or many dimensions? Simple or generalized?

In general, why do people use GD?

it is easy to implement
it is a very generic optimization technique - even if you change your model to a more general one, you can still use it

So what about analytical solutions? Well, we do use them; your claim is simply false here (if we are talking in general). For example, the OLS method is a closed-form, analytical solution, and it is widely used. If an analytical solution exists and is computationally affordable (sometimes GD is simply cheaper or faster), then you can, and even should, use it.

Nevertheless, this is always a matter of pros and cons. Analytical solutions are tightly coupled to the model, so implementing them can be inefficient if you plan to generalize or change your model in the future. They are sometimes less efficient than their numerical approximations, and sometimes they are simply harder to implement. If none of the above is true, you should use the analytical solution, and people do, really.

To sum up, you would rather use GD over an analytical solution if:

you are considering changes to the model, generalizations, or adding more complex terms/regularization/modifications
you need a generic method because you do not know much about the future of the code and the model (you are only one of the developers)
the analytical solution is computationally more expensive, and you need efficiency
the analytical solution requires more memory than you have
the analytical solution is hard to implement and you need easy, simple code
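The point about model changes can be sketched in code. Below, a generic `minimize` loop (a hypothetical helper written for this example) takes any gradient function, so switching from plain least squares to an L2-regularized cost only means passing a different gradient. Ridge regression is used here only because it still has a closed form to check against. The strength `lam`, the learning rate, and the toy data are all assumptions.

```python
import numpy as np

def minimize(grad_fn, theta0, lr=0.1, n_iters=5000):
    """Generic gradient descent: only the gradient function depends on the model."""
    theta = np.array(theta0, dtype=float)
    for _ in range(n_iters):
        theta = theta - lr * grad_fn(theta)
    return theta

# Hypothetical toy data: y = 1 + 2*x with no noise.
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x
m = len(y)

def ols_grad(theta):
    # Gradient of ||X theta - y||^2 / (2m).
    return X.T @ (X @ theta - y) / m

lam = 0.5  # assumed regularization strength, for illustration only

def ridge_grad(theta):
    # Adds the gradient of (lam / 2) * ||theta||^2.
    # For simplicity the intercept is penalized too.
    return ols_grad(theta) + lam * theta

theta_ols = minimize(ols_grad, np.zeros(2))
theta_ridge = minimize(ridge_grad, np.zeros(2))

# Closed-form ridge solution for the same cost, used only as a sanity check.
theta_ridge_exact = np.linalg.solve(X.T @ X / m + lam * np.eye(2), X.T @ y / m)
print(theta_ols, theta_ridge, theta_ridge_exact)
```

For costs with no closed form at all (an L1 penalty or a logistic loss, for example), the same loop would still apply with a suitable gradient or subgradient, whereas the analytical code would have to be replaced.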

answered Oct 08 '22 12:10

lejlot

Tags:

machine-learning

linear-regression

gradient-descent
