The scikit-learn library has the following classifiers, which look similar:
http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression
http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.SGDClassifier.html#sklearn.linear_model.SGDClassifier
Are they essentially the same, or different? If they are different, how different are the two implementations? And, given a logistic regression problem, how do you decide which one to use?
SGD allows minibatch (online/out-of-core) learning via the partial_fit method. For best results using the default learning rate schedule, the data should have zero mean and unit variance. This implementation works with data represented as dense or sparse arrays of floating point values for the features.
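For illustration, here is a minimal out-of-core sketch using `partial_fit` (the synthetic data and batch loop are made up; in practice each batch would be read from disk or a stream, and `loss="log_loss"` is spelled `loss="log"` in scikit-learn versions before 1.1):

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.RandomState(0)
clf = SGDClassifier(loss="log_loss")  # logistic-regression loss

classes = np.array([0, 1])  # partial_fit needs the full set of classes up front
for _ in range(10):  # pretend each iteration is one minibatch loaded from disk
    X_batch = rng.randn(100, 5)  # already ~zero mean, unit variance, as recommended
    y_batch = (X_batch[:, 0] > 0).astype(int)
    clf.partial_fit(X_batch, y_batch, classes=classes)

print(clf.predict(rng.randn(3, 5)))
```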
The solvers implemented in the class LogisticRegression are “liblinear”, “newton-cg”, “lbfgs”, “sag” and “saga”. In a nutshell: the “saga” solver is often the best choice, while the “liblinear” solver is used by default for historical reasons.
SGDClassifier supports multi-class classification by combining multiple binary classifiers in a “one versus all” (OVA) scheme. For each of the classes, a binary classifier is learned that discriminates between that and all other classes.
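You can see the OVA scheme in the fitted attributes: with K classes, the model stores one weight vector per class. A toy example (using the iris dataset just for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import SGDClassifier

X, y = load_iris(return_X_y=True)  # 3 classes, 4 features
clf = SGDClassifier(loss="log_loss", max_iter=1000).fit(X, y)

# One binary classifier per class: coef_ has shape (n_classes, n_features).
print(clf.coef_.shape)       # (3, 4)
print(clf.intercept_.shape)  # (3,)
```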
SGDClassifier is a linear classifier (SVM, logistic regression, among others) optimized by SGD. These are two different concepts: SGD is an optimization method, while logistic regression and the linear support vector machine are machine learning models.
LogisticRegression in sklearn doesn't have an 'sgd' solver, though. It implements regularized logistic regression: it minimizes the log loss (the negative log-likelihood), with an L2 penalty by default.
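Concretely, roughly following the scikit-learn user guide's formulation for the L2 penalty, the objective being minimized is

$$\min_{w,\,c}\ \tfrac{1}{2} w^T w + C \sum_{i=1}^{n} \log\left(\exp\left(-y_i (x_i^T w + c)\right) + 1\right)$$

where $y_i \in \{-1, 1\}$ and $C$ is the inverse regularization strength.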
SGDClassifier is a generalized linear classifier that uses Stochastic Gradient Descent as a solver. As mentioned here http://scikit-learn.org/stable/modules/sgd.html : "Even though SGD has been around in the machine learning community for a long time, it has received a considerable amount of attention just recently in the context of large-scale learning." It is easy to implement and efficient; for example, SGD is also one of the optimizers used to train neural networks.
With SGDClassifier you can use many different loss functions (the function minimized to find the optimal solution), which lets you "tune" your model and find the best SGD-based linear model for your data. Indeed, different data structures or problems call for different loss functions, as the sketch below shows.
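For example (hyperparameters left at their defaults for brevity), switching the loss changes which linear model SGD fits:

```python
from sklearn.linear_model import SGDClassifier

# Same optimizer, different models depending on the loss:
log_reg_sgd = SGDClassifier(loss="log_loss")       # logistic regression
linear_svm = SGDClassifier(loss="hinge")           # linear SVM (the default loss)
huber_like = SGDClassifier(loss="modified_huber")  # smoothed hinge, more tolerant to outliers
```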
In your example, SGDClassifier with loss='log_loss' optimizes the same loss function as LogisticRegression, but with a different solver. Depending on your data, you can get different results. You can try to find the best model using cross-validation, or even a grid-search cross-validation over the hyper-parameters.
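A minimal grid-search sketch (the pipeline, synthetic data, and parameter grid here are illustrative assumptions, not recommendations):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=1000, random_state=0)

# Standardize features, since SGD's default learning rate expects that.
pipe = make_pipeline(StandardScaler(), SGDClassifier(loss="log_loss"))
param_grid = {
    "sgdclassifier__alpha": [1e-4, 1e-3, 1e-2],  # regularization strength
    "sgdclassifier__penalty": ["l2", "l1"],
}
search = GridSearchCV(pipe, param_grid, cv=5).fit(X, y)
print(search.best_params_, search.best_score_)
```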
Hope that answers your questions.
Basically, SGD is like an umbrella capable of fitting different linear models. SGD is an approximate, iterative algorithm: it takes one sample (or a small batch) at a time, and as the number of samples processed grows, it converges toward the optimal solution. It is therefore mostly used when the dataset is large. LogisticRegression uses a full-batch solver (such as "lbfgs" or "liblinear") by default, so it is slower by comparison on large datasets. To make SGD perform well for a particular linear model, say logistic regression here, we tune its parameters; this is called hyperparameter tuning.
All linear classifiers (SVM, logistic regression, among others) can be trained with SGD: Stochastic Gradient Descent.