Predicted probabilities in R ranger package

I am trying to build a random forest classification model in R, starting from code by Ned Horning that I edited. I first used the randomForest package but then found ranger, which promises faster computation.

With randomForest, I used the code below to get predicted probabilities for each class after fitting the model:

predProbs <- as.data.frame(predict(randfor, imageBlock, type='prob'))

The probability here is the fraction of trees that vote for a class. For example:

We have 500 trees in the model and 250 of them say the observation is class 1, hence the probability is 250/500 = 50%.
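
To make the arithmetic concrete, here is a minimal base R sketch of that vote counting. The `votes` vector is hypothetical and made up for illustration; it does not come from a fitted model.

```r
# Hypothetical per-tree votes for ONE observation: 500 trees, 2 classes.
# These values are invented for illustration, not taken from a real forest.
votes <- factor(rep(c("class1", "class2"), times = c(250, 250)),
                levels = c("class1", "class2"))

# Fraction of trees voting for each class
vote_fractions <- table(votes) / length(votes)
print(vote_fractions)
```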

In ranger, I realized that there is no type = 'prob' option.

I searched and tried some adjustments but couldn't make any progress. I need an object containing the probabilities described above, produced with the ranger package.

Could anyone give some advice about the issue?

Batuhan Kavlak asked Apr 12 '19 15:04

1 Answer

You need to train a "probabilistic classifier"-type ranger object:

library("ranger")
iris.ranger = ranger(Species ~ ., data = iris, probability = TRUE)

Calling predict() on this object (which dispatches to predict.ranger) returns a list whose $predictions element is an (n_samples, n_classes) matrix of class probabilities:

probabilities = predict(iris.ranger, data = iris)$predictions
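
As a rough sketch of how this could replace the randomForest line from the question, the matrix can be converted to a data frame and a most-probable class can be read off each row. The names `predProbs` and `predClass` and the seed are illustrative choices, and the iris data stands in for the asker's own `imageBlock`.

```r
library(ranger)

set.seed(42)  # arbitrary seed, only so the sketch is reproducible
iris.ranger <- ranger(Species ~ ., data = iris, probability = TRUE)

# One row per observation, one column per class
probabilities <- predict(iris.ranger, data = iris)$predictions

# Same shape as the predProbs object built with randomForest in the question
predProbs <- as.data.frame(probabilities)
head(predProbs)

# Most probable class for each observation (illustrative helper variable)
predClass <- colnames(probabilities)[max.col(probabilities, ties.method = "first")]
table(predClass)
```

One caveat: according to the ranger documentation, probability = TRUE grows a probability forest, so each value is an average of per-tree class probability estimates. It should be close to, but is not necessarily identical to, the vote fraction (e.g. 250/500) that randomForest reports with type = 'prob'.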

user1808924 answered Oct 14 '22 01:10