 

Major assumptions of machine learning classifiers (LR, SVM, and decision trees)

In classical statistics, people usually state the assumptions being made (e.g. normality and linearity of the data, independence of observations). But when I read machine learning textbooks and tutorials, the underlying assumptions are not always explicitly or completely stated. What are the major assumptions of the following ML classifiers for binary classification, and which ones are not so important to uphold and which must be upheld strictly?

  • Logistic regression
  • Support vector machine (linear and non-linear kernel)
  • Decision trees
KubiK888 asked Feb 16 '16


People also ask

What are the assumptions of SVM?

Thus, SVMs can be defined as linear classifiers under the following two assumptions: the margin should be as large as possible, and the support vectors are the most informative data points because they lie closest to the decision boundary and are therefore the ones most likely to be misclassified.
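
A minimal sketch of that idea, assuming scikit-learn is available (the toy dataset and parameters are purely illustrative, not part of the original answer): fit a linear SVM and inspect which points end up as support vectors.

```python
# Minimal sketch (assumes scikit-learn): fit a linear SVM and inspect the
# support vectors, i.e. the few points closest to the margin.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=60, centers=2, random_state=0)  # toy 2-class data

clf = SVC(kernel="linear", C=1.0)  # large-margin linear classifier
clf.fit(X, y)

# Only a handful of points (the support vectors) define the separating hyperplane;
# the remaining points could be removed without changing the decision boundary.
print("number of support vectors per class:", clf.n_support_)
print("support vectors:\n", clf.support_vectors_)
```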

What are the assumptions of machine learning?

It assumes that there is minimal or no multicollinearity among the independent variables. It usually requires a large sample size to predict properly. It assumes the observations to be independent of each other.
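
The multicollinearity assumption can be checked directly. A minimal sketch, assuming pandas and statsmodels are available (the synthetic predictors and their names are made up for illustration):

```python
# Minimal sketch (assumes pandas/statsmodels): check the "little multicollinearity"
# assumption by computing a variance inflation factor (VIF) per predictor.
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=200)   # deliberately correlated with x1
x3 = rng.normal(size=200)                    # independent predictor
X = pd.DataFrame({"x1": x1, "x2": x2, "x3": x3})

# A VIF far above ~10 for x1/x2 flags the collinearity; x3 stays near 1.
for i, col in enumerate(X.columns):
    print(col, variance_inflation_factor(X.values, i))
```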

What is the difference between SVM and decision tree?

SVM uses the kernel trick to solve non-linear problems, whereas decision trees derive hyper-rectangles in input space. Decision trees are better suited to categorical data and handle collinearity better than SVM.
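
A small sketch of that contrast, assuming scikit-learn (the moons dataset and depth limit are arbitrary choices): the same non-linear problem is fit with an RBF-kernel SVM and with a shallow decision tree, whose learned rules are axis-aligned thresholds, i.e. hyper-rectangles.

```python
# Minimal sketch (assumes scikit-learn): kernel SVM vs. axis-aligned tree splits.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_moons(n_samples=300, noise=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

svm = SVC(kernel="rbf").fit(X_tr, y_tr)                 # kernel trick handles the curvature
tree = DecisionTreeClassifier(max_depth=3).fit(X_tr, y_tr)

print("SVM accuracy: ", svm.score(X_te, y_te))
print("Tree accuracy:", tree.score(X_te, y_te))
print(export_text(tree, feature_names=["x0", "x1"]))    # threshold rules = rectangles
```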

Does SVM use decision tree?

In machine learning, tree-based techniques and Support Vector Machines (SVM) are popular tools for building prediction models. Both can be intuitively understood as ways of separating data points into different groups (labels), although they rest on different theories.


2 Answers

IID (independent and identically distributed) data is the fundamental assumption of almost all statistical learning methods.

Logistic regression is a special case of the GLM (generalized linear model). So, beyond some technical requirements, the strictest restriction lies in the assumed distribution of the response: it must follow a distribution in the exponential family. You can dig deeper at https://en.wikipedia.org/wiki/Generalized_linear_model, and the Stanford CS229 lecture notes 1 also have excellent coverage of this topic.
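
For reference, a short sketch of that exponential-family claim for the Bernoulli case relevant to binary classification (notation follows the usual GLM convention, not anything specific in the answer):

```latex
% GLM assumption: the response distribution is in the exponential family
p(y;\eta) = b(y)\,\exp\!\big(\eta\, T(y) - a(\eta)\big)

% Bernoulli case (binary labels), with \phi = P(y = 1):
p(y;\phi) = \phi^{y}(1-\phi)^{1-y}
          = \exp\!\Big(y \log\tfrac{\phi}{1-\phi} + \log(1-\phi)\Big)

% so the natural parameter is the log-odds \eta = \log\tfrac{\phi}{1-\phi}, and
% inverting it gives the sigmoid \phi = \frac{1}{1+e^{-\eta}} of logistic regression.
```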

SVM is quite tolerant of input data, especially the soft-margin version. I cannot remember any specific distributional assumption being made about the data (please correct me if I am wrong).

Decision trees tell the same story as SVMs: no particular distributional assumptions about the input data.
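
One practical consequence worth showing: tree splits depend only on the ordering of feature values, so a strictly monotone transform of a feature typically leaves the fitted tree's predictions unchanged. A minimal sketch, assuming scikit-learn (the dataset and the exp transform are arbitrary illustrations):

```python
# Minimal sketch (assumes scikit-learn): trees use only feature orderings, so a
# strictly increasing transform of every feature should not change predictions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=4, random_state=0)
X_warped = np.exp(X)  # strictly increasing transform of every feature

tree_raw = DecisionTreeClassifier(random_state=0).fit(X, y)
tree_warped = DecisionTreeClassifier(random_state=0).fit(X_warped, y)

same = np.array_equal(tree_raw.predict(X), tree_warped.predict(X_warped))
print("identical predictions after monotone transform:", same)  # expected: True
```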

RogerTR answered Oct 24 '22


Great question.

Logistic Regression also assumes the following:

  1. That there isn't (or there is little) multicollinearity (high correlation) among the independent variables.

  2. Even though LR doesn't require the dependent and independent variables to be linearly related, it does require the independent variables to be linearly related to the log odds. The log-odds function is simply log(p/(1-p)); a sketch follows this list.
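
A minimal sketch of the log-odds point, assuming scikit-learn and numpy (the synthetic dataset is purely illustrative): the log-odds of the fitted probabilities coincide with the linear score Xβ + intercept, up to numerical precision.

```python
# Minimal sketch (assumes scikit-learn/numpy): for a fitted logistic regression,
# log(p/(1-p)) of the predicted probabilities equals the linear score X @ coef + b.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=3, n_informative=3,
                           n_redundant=0, class_sep=0.5, random_state=0)
clf = LogisticRegression().fit(X, y)

p = clf.predict_proba(X)[:, 1]
log_odds = np.log(p / (1 - p))
linear_score = X @ clf.coef_.ravel() + clf.intercept_[0]

print("max difference:", np.max(np.abs(log_odds - linear_score)))  # ~1e-12
```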

msarafzadeh answered Oct 24 '22