
Naive bayes in R

Tags: r

I am getting an error while running a naive Bayes classifier in R. I am using the following code:

mod1 <- naiveBayes(factor(X20) ~ factor(X1) + factor(X2) + factor(X3) + factor(X4) +
                     factor(X5) + factor(X6) + factor(X7) + factor(X8) + factor(X9) +
                     factor(X10) + factor(X11) + factor(X12) + factor(X13) + factor(X14) +
                     factor(X15) + factor(X16) + factor(X17) + factor(X18) + factor(X19),
                   data = intent.test)

res1 <- predict(mod1)$posterior

The first part of this code runs fine, but when I try to predict the posterior probabilities it throws the following error:

**Error in as.data.frame(newdata) : 
argument "newdata" is missing, with no default**

I tried running something like

res1 <- predict(mod1,new_data=intent.test)$posterior

but this also gives the same error.

SumitGupta asked Jan 18 '23 13:01


1 Answer

You seem to be using naiveBayes from the e1071 package, whose predict method requires a newdata argument; that explains both errors. In your second call the argument is spelled new_data rather than newdata, so it is not matched to newdata, which is therefore still missing. (You can check the source code of the predict.naiveBayes function on CRAN; the second line of the code expects newdata, as in newdata <- as.data.frame(newdata).) Also, as pointed out by @Vincent, you're better off converting your variables to factors before calling the NB algorithm, although this has nothing to do with the errors above.
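If you want to stay with e1071, a minimal sketch of the fix looks like the following. It uses the built-in iris data as a stand-in for your intent.test (an assumption made only so the example runs on its own). Note that predict for an e1071 model returns a factor of classes, or a matrix of probabilities when type = "raw" is given, rather than a list with a $posterior element.

```r
library(e1071)

# iris ships with R; Species plays the role of the class variable
mod <- naiveBayes(Species ~ ., data = iris)

# newdata must be passed explicitly, even when predicting on the training data
cls  <- predict(mod, newdata = iris)                # predicted class labels
post <- predict(mod, newdata = iris, type = "raw")  # posterior probabilities

head(post)
```

Each row of post corresponds to one observation and each column to one class level.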

With NaiveBayes from the klaR package this problem does not arise, because its predict method also works when newdata is omitted. E.g.,

data(spam, package="ElemStatLearn")
library(klaR)

# set up a training sample
train.ind <- sample(1:nrow(spam), ceiling(nrow(spam)*2/3), replace=FALSE)

# apply NB classifier
nb.res <- NaiveBayes(spam ~ ., data=spam[train.ind,])

# predict on holdout units
nb.pred <- predict(nb.res, spam[-train.ind,])

# but this also works on the training sample, i.e. without using a `newdata`
head(predict(nb.res))
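
Since the question asks for posterior probabilities, here is a short sketch of how to pull them out of a klaR prediction. It again uses the built-in iris data rather than the spam data above, purely as an assumption to keep the example self-contained.

```r
library(klaR)

# fit on the full iris data; Species is the class variable
nb <- NaiveBayes(Species ~ ., data = iris)

# without newdata, predictions are made for the training data
pred <- predict(nb)

head(pred$class)      # predicted class labels
head(pred$posterior)  # posterior probabilities, one column per class
```

The result is a list, so the $posterior accessor used in the question works here.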
chl answered Jan 24 '23 10:01