Explanation of the verbose mode (do.trace) output when running randomForest in R

Question

I am running randomForest in R with the verbose mode (do.trace), and I was wondering what the columns in the message mean. I can see that ntree is the number of trees and OOB is the out-of-bag error rate, but what are "1" and "2"?

> rf.m <- randomForest(x = X.train, y=as.factor(y.train), do.trace=10)
ntree      OOB      1      2
   10:  32.03% 15.60% 82.47%
   20:  29.18% 10.51% 86.31%
   30:  27.44%  7.47% 88.57%
   40:  26.48%  5.29% 91.33%
   50:  25.92%  4.35% 91.96%
   ....

eipi10 · Accepted Answer

Columns 1 and 2 in the output give the classification error for each class. The OOB value is the weighted average of the class errors (weighted by the fraction of observations in each class).
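To see which class each numbered column refers to, it can help to list the factor levels of the response. The following is a minimal sketch, assuming the trace numbers the classes in factor-level order; the y.train values here are a hypothetical stand-in, not the asker's data.

```r
# Hypothetical stand-in for the asker's y.train: any two-class response
y.train <- c("no", "yes", "no", "no", "yes")
y <- as.factor(y.train)

# Assumption: column k of the trace corresponds to levels(y)[k]
class_map <- data.frame(column = seq_along(levels(y)),
                        class  = levels(y))
class_map

# Fraction of observations in each class, i.e. the weights
# that go into the OOB average
prop.table(table(y))
```

With the real y.train, the same two calls show which label sits behind "1" and "2" and how unbalanced the classes are.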

An example (adapting the random forest example from the help page):

# Print the running OOB and class errors after every 100 trees
set.seed(71)
iris.rf <- randomForest(Species ~ ., data=iris, importance=TRUE,
                        proximity=TRUE, do.trace=100)

ntree      OOB      1      2      3
  100:   6.00%  0.00%  8.00% 10.00%
  200:   5.33%  0.00%  6.00% 10.00%
  300:   6.00%  0.00%  8.00% 10.00%
  400:   4.67%  0.00%  8.00%  6.00%
  500:   5.33%  0.00%  8.00%  8.00%

The weighted average of the class errors after 100 trees gives an OOB error rate of 6.0%, exactly as reported in the first line of the trace above. prop.table returns the fraction of observations in each class of Species. We multiply that element-wise by the class errors after 100 trees, as given in the trace, and then sum to get the weighted average error over all classes (the OOB error).

sum(prop.table(table(iris$Species)) * c(0, 0.08, 0.10))
[1] 0.06

You can avoid calling sum by using matrix multiplication, which here is equivalent to the dot (scalar, inner) product and returns the result as a 1 x 1 matrix:

prop.table(table(iris$Species)) %*% c(0, 0.08, 0.10)
     [,1]
[1,] 0.06
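
The numbers printed in the trace are also kept in the fitted object, so the check can be done without retyping them. This is a sketch rather than part of the original answer: it relies on the err.rate component of a randomForest classification fit (one row per tree, an "OOB" column followed by one column per class), and the names rows, w, weighted and oob are illustrative. The match with the class proportions is expected only once every observation has been out-of-bag at least once, which is why the early rows are left out.

```r
library(randomForest)

set.seed(71)
iris.rf <- randomForest(Species ~ ., data = iris, importance = TRUE,
                        proximity = TRUE)

# One row per tree; columns are "OOB" plus one per class level
err <- iris.rf$err.rate
colnames(err)

# The forest sizes that do.trace = 100 would print
rows <- c(100, 200, 300, 400, 500)
err[rows, ]

# Fraction of observations in each class
w <- as.vector(prop.table(table(iris$Species)))

# Weighted average of the per-class errors at each forest size
weighted <- as.vector(err[rows, -1] %*% w)
oob <- unname(err[rows, "OOB"])

# Compare the weighted average with the reported OOB error
all.equal(weighted, oob)
```

Because iris has equal class sizes, the weights are all 1/3; with unbalanced classes, as in the question, the larger class dominates the OOB figure.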

Tags: r, random-forest

Alby
