"wrong model type for regression" error in 10-fold cross-validation for Naive Bayes using R

I am implementing 10-fold cross-validation for Naive Bayes on some test data with two classes (0 and 1). I followed the steps below and got an error.

data(testdata)

attach(testdata)

X <- subset(testdata, select=-Class)

Y <- Class

library(e1071)

naive_bayes <- naiveBayes(X,Y)

library(caret)
library(klaR)

nb_cv <- train(X, Y, method = "nb", trControl = trainControl(method = "cv", number = 10))

## Error:
## Error in train.default(X, Y, method = "nb", trControl = trainControl(number = 10)) : 
## wrong model type for regression


dput(testdata)

structure(list(Feature.1 = 6.534088, Feature.2 = -19.050915, 
Feature.3 = 7.599378, Feature.4 = 5.093594, Feature.5 = -22.15166, 
Feature.6 = -7.478444, Feature.7 = -59.534652, Feature.8 = -1.587918, 
Feature.9 = -5.76889, Feature.10 = 95.810563, Feature.11 = 49.124086, 
Feature.12 = -21.101489, Feature.13 = -9.187984, Feature.14 = -10.53006, 
Feature.15 = -3.782506, Feature.16 = -10.805074, Feature.17 = 34.039509, 
Feature.18 = 5.64245, Feature.19 = 19.389724, Feature.20 = 16.450196, 
Class = 1L), .Names = c("Feature.1", "Feature.2", "Feature.3", 
"Feature.4", "Feature.5", "Feature.6", "Feature.7", "Feature.8", 
"Feature.9", "Feature.10", "Feature.11", "Feature.12", "Feature.13", 
"Feature.14", "Feature.15", "Feature.16", "Feature.17", "Feature.18", 
"Feature.19", "Feature.20", "Class"), class = "data.frame", row.names = c(NA, 
-1L))

Also, how do I calculate R-squared or AUC for this model?

Dataset: there are 10,000 records with 20 features and a binary class.

Shivraj Nimbalkar asked Apr 29 '14 07:04



1 Answer

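Naive Bayes is a classifier, so converting Y to a factor, for example with Y <- as.factor(Class), is the right way to tackle the problem. In your original code Class is an integer (0/1), so caret's train() treats the task as regression. Since method = "nb" only supports classification, train() stops with "wrong model type for regression".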
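A minimal sketch of that fix, assuming the caret and klaR packages are installed. The full testdata set is not shown in the question, so the data below is a synthetic stand-in, and the level labels "neg" and "pos" are illustrative choices; only train(), trainControl() and method = "nb" come from the question itself.

```r
library(caret)
library(klaR)  # provides NaiveBayes(), which caret's "nb" method relies on

# Synthetic stand-in for the asker's data (assumption: 20 numeric
# features plus an integer 0/1 Class column, as in the dput output).
set.seed(42)
n <- 200
testdata <- as.data.frame(matrix(rnorm(n * 20), ncol = 20))
names(testdata) <- paste0("Feature.", 1:20)
testdata$Class <- rbinom(n, 1, 0.5)
# Shift one feature for class 1 so the two classes are separable.
testdata$Feature.1 <- testdata$Feature.1 + 2 * testdata$Class

X <- subset(testdata, select = -Class)

# The key change: an integer outcome is treated as regression,
# a factor outcome is treated as classification.
Y <- factor(testdata$Class, levels = c(0, 1), labels = c("neg", "pos"))

nb_cv <- train(X, Y, method = "nb",
               trControl = trainControl(method = "cv", number = 10))
print(nb_cv)
```

Using testdata$Class directly also avoids the attach() call from the question. With a factor outcome, the printed summary should report accuracy and kappa across the 10 folds rather than raising the error.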

As for R-squared, that metric is only defined for regression problems, not classification problems. Classification models are evaluated with other metrics, such as precision and recall. The AUC you mention is also a classification metric, so it does apply here.

Please refer to the Wikipedia article on binary classification for more information on these metrics: http://en.wikipedia.org/wiki/Binary_classification
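To make those metrics concrete, here is a small base-R sketch. The labels and predicted probabilities are made-up illustrative values, not output from the asker's model, and auc_rank() is a hypothetical helper name. It computes precision and recall at a 0.5 threshold, and AUC through the rank-sum (Mann-Whitney) identity.

```r
# Hypothetical true labels and predicted probabilities of class 1.
actual <- c(1, 1, 1, 0, 0, 0, 1, 0)
prob   <- c(0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1)

# Turn probabilities into hard predictions at a 0.5 threshold.
pred <- as.integer(prob >= 0.5)

tp <- sum(pred == 1 & actual == 1)  # true positives
fp <- sum(pred == 1 & actual == 0)  # false positives
fn <- sum(pred == 0 & actual == 1)  # false negatives

precision <- tp / (tp + fp)  # share of predicted positives that are correct
recall    <- tp / (tp + fn)  # share of actual positives that are found

# AUC as the probability that a random positive is scored above a
# random negative, computed from the ranks of the scores.
auc_rank <- function(actual, prob) {
  r     <- rank(prob)
  n_pos <- sum(actual == 1)
  n_neg <- sum(actual == 0)
  (sum(r[actual == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

auc_value <- auc_rank(actual, prob)
print(c(precision = precision, recall = recall, auc = auc_value))
```

Within caret itself, a cross-validated AUC can be requested by passing classProbs = TRUE and summaryFunction = twoClassSummary to trainControl() and metric = "ROC" to train(). That route requires the factor levels of Y to be valid R names (for example "neg" and "pos" rather than "0" and "1").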

user3585718 answered Jan 03 '23 19:01