I have constructed a decision tree for a dataset using rpart.
I divided the data into two parts, a training set and a test set, and built the tree on the training data. I now want to calculate the accuracy of the predictions made by that model.
My code is shown below:
library(rpart)
#reading the data
data = read.table("source")
names(data) <- c("a", "b", "c", "d", "class")
#generating test and train data - data selected randomly with an 80/20 split
trainIndex <- sample(1:nrow(data), floor(0.8 * nrow(data)))
train <- data[trainIndex,]
test <- data[-trainIndex,]
#tree construction based on information gain
tree = rpart(class ~ a + b + c + d, data = train, method = 'class', parms = list(split = "information"))
I now want to calculate the accuracy of the predictions generated by the model by comparing them against the actual values in the test data, but I am facing an error while doing so.
My code is shown below:
t_pred = predict(tree,test,type="class")
t = test['class']
accuracy = sum(t_pred == t)/length(t)
print(accuracy)
I get an error message that states:
Error in t_pred == t : comparison of these types is not implemented
In addition: Warning message:
Incompatible methods ("Ops.factor", "Ops.data.frame") for "=="
On checking the type of t_pred, I found that it is of type integer; however, the documentation (https://stat.ethz.ch/R-manual/R-devel/library/rpart/html/predict.rpart.html) states that the predict() method should return a vector. I do not understand why the type of this variable is integer rather than a list. Where have I made a mistake, and how can I fix it?
Accuracy can be computed by comparing the actual test-set values with the predicted values; once you have it, you can usually improve it further by tuning the parameters of the decision tree.
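The error in the question comes from t = test['class']: single-bracket indexing returns a one-column data frame, while predict(tree, test, type = "class") returns a factor, and R cannot compare a factor with a data frame (hence the Ops.factor / Ops.data.frame warning). Note also that length() of a data frame is its number of columns, so the original denominator would have been wrong even if the comparison had worked. A minimal sketch of the direct fix, reusing the tree, test and t_pred objects from the question:
#extract the actual class labels as a vector, not a data frame
t <- test$class          # or test[['class']]
#element-wise comparison now works; mean() gives the proportion correct
accuracy <- mean(t_pred == t)
print(accuracy)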
Alternatively, try calculating the confusion matrix first:
confMat <- table(test$class,t_pred)
Now you can calculate the accuracy by dividing the sum of the diagonal of the matrix, which counts the correct predictions, by the total sum of the matrix:
accuracy <- sum(diag(confMat))/sum(confMat)
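If you also want per-class statistics such as sensitivity and specificity alongside accuracy, the caret package wraps all of this in confusionMatrix(); a sketch, assuming caret is installed:
library(caret)   # assumes the caret package is available
#confusionMatrix() takes the predicted factor first and the reference labels second;
#the reference is coerced to a factor with the same levels as the predictions
cm <- confusionMatrix(data = t_pred,
                      reference = factor(test$class, levels = levels(t_pred)))
print(cm)        # prints the table, overall accuracy with a 95% CI, kappa and per-class stats
The printed output includes a confidence interval for the accuracy, which is useful when the test set is small.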