I'd like to ask everyone a question about how correlated features (variables) affect the classification accuracy of machine learning algorithms. By correlated features I mean features that are correlated with each other, not with the target class (e.g., the perimeter and the area of a geometric figure, or the level of education and the average income). In my opinion correlated features negatively affect the accuracy of a classification algorithm, because the correlation makes one of them redundant. Is this really the case? Does the answer depend on the type of classification algorithm? Any suggestions for papers and lectures are very welcome! Thanks
This answer refers to least-squares estimation. Writing the design matrix through its singular value decomposition, X = U S Vᵀ, the variance of the least-squares weights is Var(Wₗₛ) = σ² V S⁻² Vᵀ. When we have highly correlated features in the dataset, some of the singular values in the “S” matrix will be close to zero, so the inverse square of “S” (S⁻² in the equation above) will be large, which makes the variance of Wₗₛ large. This is why it is advised to keep only one feature in the dataset if two features are highly correlated.
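As a minimal sketch of that effect (the data and the 1e-3 noise scale are invented for illustration), two nearly identical columns in the design matrix produce one tiny singular value, and the S⁻² factor then blows up:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 1e-3 * rng.normal(size=n)      # nearly an exact copy of x1
X = np.column_stack([x1, x2])

# Singular values of the design matrix: the strong correlation between
# the two columns drives the smallest singular value toward zero.
s = np.linalg.svd(X, compute_uv=False)
print(s)

# Var(W_ls) is proportional to V S^-2 V^T, so a tiny singular value
# inflates the variance of the least-squares weights enormously.
print((1.0 / s**2).max())
```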
Positive correlation means that if feature A increases then feature B also increases, and if feature A decreases then feature B also decreases; the two features move in tandem in an (approximately) linear relationship. Negative correlation means that if feature A increases then feature B decreases, and vice versa.
The stronger the correlation, the more difficult it is to change one variable without changing the other. It becomes difficult for the model to estimate the relationship between each independent variable and the dependent variable separately, because the independent variables tend to change in unison.
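Those definitions can be checked directly with NumPy's `corrcoef` (a toy sketch; the arrays are made up for illustration):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
b = 2.0 * a + 1.0     # moves in tandem with a: positive correlation
c = -3.0 * a + 10.0   # moves against a: negative correlation

print(np.corrcoef(a, b)[0, 1])   # close to +1 (perfect positive correlation)
print(np.corrcoef(a, c)[0, 1])   # close to -1 (perfect negative correlation)
```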
Correlated features do not affect classification accuracy per se. The problem in realistic situations is that we have a finite number of training examples with which to train a classifier. For a fixed number of training examples, increasing the number of features typically increases classification accuracy up to a point, but as the number of features continues to increase, classification accuracy will eventually decrease, because we are then undersampled relative to the large number of features. To learn more about the implications of this, look at the curse of dimensionality.
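That peaking effect can be simulated with a simple nearest-centroid classifier (a sketch on invented synthetic data: two informative features plus a varying number of irrelevant ones, with the training set size held fixed):

```python
import numpy as np

def nearest_centroid_accuracy(n_noise, seed=3):
    """Accuracy with 2 informative features plus n_noise irrelevant ones."""
    rng = np.random.default_rng(seed)
    n_train, n_test = 40, 1000

    def sample(n):
        y = rng.integers(0, 2, size=n)
        means = np.where(y[:, None] == 1, 1.0, -1.0)   # class means at +/-1
        informative = means + rng.normal(size=(n, 2))
        noise = rng.normal(size=(n, n_noise))          # pure noise features
        return np.hstack([informative, noise]), y

    X_train, y_train = sample(n_train)
    X_test, y_test = sample(n_test)
    c0 = X_train[y_train == 0].mean(axis=0)
    c1 = X_train[y_train == 1].mean(axis=0)
    pred = (np.linalg.norm(X_test - c1, axis=1)
            < np.linalg.norm(X_test - c0, axis=1)).astype(int)
    return (pred == y_test).mean()

# With the training set fixed at 40 examples, piling on irrelevant
# features eventually hurts accuracy.
print(nearest_centroid_accuracy(0))
print(nearest_centroid_accuracy(100))
```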
If two numerical features are perfectly correlated, then one adds no additional information (it is determined by the other). So if the number of features is too high relative to the training sample size, it is beneficial to reduce the number of features through a feature extraction technique (e.g., principal component analysis).
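A quick sketch of that point (made-up data; plain NumPy rather than a full PCA routine): when one column is an exact linear function of another, the sample covariance matrix has rank 1, so a single principal component carries all of the variance:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=100)
X = np.column_stack([x, 3.0 * x - 2.0])   # second feature determined by the first

# Eigenvalues of the sample covariance matrix (ascending order):
# one of them is numerically zero, so PCA would keep just one component.
cov = np.cov(X, rowvar=False)
eigvals = np.linalg.eigvalsh(cov)
print(eigvals)
```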
The effect of correlation does depend on the type of classifier. Some nonparametric classifiers are less sensitive to correlation of variables (although training time will likely increase with an increase in the number of features). For statistical methods such as Gaussian maximum likelihood, having too many correlated features relative to the training sample size will render the classifier unusable in the original feature space (the covariance matrix of the sample data becomes singular).
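That failure mode is easy to reproduce (a toy sketch with invented data): with two perfectly correlated features the sample covariance matrix is rank-deficient, and Gaussian maximum likelihood, which needs its inverse, breaks down:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=50)
X = np.column_stack([x, 2.0 * x])        # perfectly correlated features

cov = np.cov(X, rowvar=False)
print(np.linalg.matrix_rank(cov))        # 1, not 2: the matrix is singular
print(np.linalg.det(cov))                # (numerically) zero determinant

# Gaussian maximum likelihood requires cov^-1, which does not exist here;
# a regularized or pseudo-inverse estimate would be needed instead.
```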