I am implementing 10-fold cross-validation for Naive Bayes on some test data with 2 classes (0 and 1). I followed the steps below and am getting an error.
data(testdata)
attach(testdata)
X <- subset(testdata, select=-Class)
Y <- Class
library(e1071)
naive_bayes <- naiveBayes(X,Y)
library(caret)
library(klaR)
nb_cv <- train(X, Y, method = "nb", trControl = trainControl(method = "cv", number = 10))
## Error:
## Error in train.default(X, Y, method = "nb", trControl = trainControl(number = 10)) :
## wrong model type for regression
dput(testdata)
structure(list(Feature.1 = 6.534088, Feature.2 = -19.050915,
Feature.3 = 7.599378, Feature.4 = 5.093594, Feature.5 = -22.15166,
Feature.6 = -7.478444, Feature.7 = -59.534652, Feature.8 = -1.587918,
Feature.9 = -5.76889, Feature.10 = 95.810563, Feature.11 = 49.124086,
Feature.12 = -21.101489, Feature.13 = -9.187984, Feature.14 = -10.53006,
Feature.15 = -3.782506, Feature.16 = -10.805074, Feature.17 = 34.039509,
Feature.18 = 5.64245, Feature.19 = 19.389724, Feature.20 = 16.450196,
Class = 1L), .Names = c("Feature.1", "Feature.2", "Feature.3",
"Feature.4", "Feature.5", "Feature.6", "Feature.7", "Feature.8",
"Feature.9", "Feature.10", "Feature.11", "Feature.12", "Feature.13",
"Feature.14", "Feature.15", "Feature.16", "Feature.17", "Feature.18",
"Feature.19", "Feature.20", "Class"), class = "data.frame", row.names = c(NA,
-1L))
Also, how can I calculate R-squared or AUC for this model?
Dataset: There are 10000 records with 20 features and a binary class.
Naive Bayes is a classifier, so converting Y to a factor is the right way to tackle the problem. Your original code passed a numeric Y (0 and 1) to a classification method, so train() treated the task as regression and stopped with "wrong model type for regression".
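As a minimal sketch of that fix: the snippet below builds a small synthetic stand-in for testdata (the real data is not available here, so the 200 rows, three features, and the "neg"/"pos" labels are all assumptions), converts Class to a factor, and then repeats the train() call from the question. The caret call is guarded so the snippet still runs if caret or klaR is not installed.

```r
# Synthetic stand-in for testdata (hypothetical): 200 rows, 3 numeric features, 0/1 class
set.seed(42)
n <- 200
testdata <- data.frame(
  Feature.1 = rnorm(n),
  Feature.2 = rnorm(n),
  Feature.3 = rnorm(n)
)
testdata$Class <- as.integer(testdata$Feature.1 + rnorm(n) > 0)

X <- subset(testdata, select = -Class)

# The key change: make the outcome a factor so train() treats this as classification.
# The labels "neg"/"pos" are an arbitrary choice for this sketch.
Y <- factor(testdata$Class, levels = c(0, 1), labels = c("neg", "pos"))

# Same call as in the question, now with a factor outcome
if (requireNamespace("caret", quietly = TRUE) &&
    requireNamespace("klaR", quietly = TRUE)) {
  library(caret)
  nb_cv <- train(X, Y, method = "nb",
                 trControl = trainControl(method = "cv", number = 10))
  print(nb_cv)
}
```

With the real data, only the Y line should need to change; referring to testdata$Class directly also avoids relying on attach().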
As far as R-squared is concerned, that metric is only computed for regression problems, not classification problems. To evaluate classification problems there are other metrics, such as precision and recall.
Please refer to the Wikipedia article for more information on these metrics: http://en.wikipedia.org/wiki/Binary_classification
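To make these metrics concrete, here is a rough base-R sketch that computes precision, recall, and the AUC asked about in the question. The label and probability vectors are made-up illustration values, not output from the model above, and the 0.5 cutoff is an arbitrary assumption. The AUC uses the rank-sum (Mann-Whitney) identity rather than any particular package.

```r
# Hypothetical true labels (1 = positive) and predicted probabilities of the positive class
actual <- c(1, 0, 1, 1, 0, 0, 1, 0)
prob   <- c(0.9, 0.4, 0.65, 0.3, 0.2, 0.55, 0.8, 0.1)

# Turn probabilities into hard predictions (0.5 cutoff is an assumption)
pred <- as.integer(prob >= 0.5)

# Confusion-matrix counts
tp <- sum(pred == 1 & actual == 1)  # true positives
fp <- sum(pred == 1 & actual == 0)  # false positives
fn <- sum(pred == 0 & actual == 1)  # false negatives

precision <- tp / (tp + fp)  # share of predicted positives that are correct
recall    <- tp / (tp + fn)  # share of actual positives that are found

# AUC via the rank-sum identity: the probability that a randomly chosen
# positive is scored higher than a randomly chosen negative
r     <- rank(prob)
n_pos <- sum(actual == 1)
n_neg <- sum(actual == 0)
auc   <- (sum(r[actual == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

print(c(precision = precision, recall = recall, auc = auc))
```

For these toy vectors, precision and recall both come out to 0.75 and the AUC to 0.875. Within caret, a cross-validated ROC AUC can likely be obtained by passing classProbs = TRUE and summaryFunction = twoClassSummary to trainControl() and metric = "ROC" to train(), provided the factor levels are valid R names.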