
In gbm multinomial dist, how to use predict to get categorical output? [duplicate]

My response is a categorical variable (letters of the alphabet), so I used distribution='multinomial' when fitting the model. Now I want to predict the response and get the output as those letters instead of a matrix of probabilities.

However, predict(model, newdata, type='response') returns probabilities, which looks the same as the result of type='link'.

Is there a way to obtain categorical outputs?

BST = gbm(V1~.,data=training,distribution='multinomial',n.trees=2000,interaction.depth=4,cv.folds=5,shrinkage=0.005)

predBST = predict(BST,newdata=test,type='response')

Asked by shavendy, Apr 05 '15 06:04


1 Answer

In predict.gbm documentation, it is mentioned:

If type="response" then gbm converts back to the same scale as the outcome. Currently the only effect this will have is returning probabilities for bernoulli and expected counts for poisson. For the other distributions "response" and "link" return the same.

What you should do, as Dominic suggests, is pick the class with the highest probability in each row of the resulting predBST array by calling apply(predBST, 1, which.max). This returns the column index of the winning class for each observation; colnames(predBST) maps those indices back to the class labels. Here is a code sample with the iris dataset:

library(gbm)

data(iris)

df <- iris[,-c(1)] # drops the first column (Sepal.Length); iris has no index column, so this step is optional

df <- df[sample(nrow(df)),]  # shuffle

df.train <- df[1:100,]
df.test <- df[101:150,]

BST = gbm(Species~.,data=df.train,
         distribution='multinomial',
         n.trees=200,
         interaction.depth=4,
         #cv.folds=5,
         shrinkage=0.005)

predBST = predict(BST,n.trees=200, newdata=df.test,type='response')

p.predBST <- apply(predBST, 1, which.max)

> predBST[1:6,,]
         setosa versicolor  virginica
[1,] 0.89010862 0.05501921 0.05487217
[2,] 0.09370400 0.45616148 0.45013452
[3,] 0.05476228 0.05968445 0.88555327
[4,] 0.05452803 0.06006513 0.88540684
[5,] 0.05393377 0.06735331 0.87871292
[6,] 0.05416855 0.06548646 0.88034499

> head(p.predBST)
[1] 1 2 3 3 3 3
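
The indices above still need to be turned into class names to answer the original question. Below is a minimal sketch of that last step, assuming the prediction array carries the class names as its column names, as the printed output suggests. To keep it runnable without fitting a model, the probabilities are copied from the six rows shown above into an array of the same shape; `probs`, `idx`, and `pred_labels` are illustrative names, not part of gbm.

```r
# Probabilities copied from the six rows printed above, stored as an
# n x classes x 1 array to mimic the shape of the multinomial prediction.
# (Assumption: the real predBST has the same layout and column names.)
probs <- array(
  c(0.89010862, 0.09370400, 0.05476228, 0.05452803, 0.05393377, 0.05416855,  # setosa
    0.05501921, 0.45616148, 0.05968445, 0.06006513, 0.06735331, 0.06548646,  # versicolor
    0.05487217, 0.45013452, 0.88555327, 0.88540684, 0.87871292, 0.88034499), # virginica
  dim = c(6, 3, 1),
  dimnames = list(NULL, c("setosa", "versicolor", "virginica"), NULL)
)

# Column index of the largest probability in each row
idx <- apply(probs, 1, which.max)

# Map the indices back to class names, keeping the levels in column order
pred_labels <- factor(colnames(probs)[idx], levels = colnames(probs))
print(pred_labels)
```

With the fitted model from the sample, the same mapping would be colnames(predBST)[p.predBST], and table(colnames(predBST)[p.predBST], df.test$Species) should give a quick confusion matrix against the held-out labels.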

Answered by desertnaut, Nov 19 '22 02:11