In R, what is the functionality of probability=TRUE in the svm function of the e1071 package?
model <- svm (Type ~ ., data, probability=TRUE, cost = 100, gamma = 1)
One standard way to obtain a “probability” out of an SVM is Platt scaling, which many SVM implementations provide. In the binary case, a logistic regression is fit to the SVM's scores, using an additional cross-validation on the training data.
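To make the Platt scaling idea concrete, here is a minimal sketch on a two-class subset of iris. It assumes e1071 is installed, and the names two, score, y and platt are purely illustrative. It fits the logistic regression directly on the training scores and skips the extra cross-validation step, so it is a simplification of the technique and not a reproduction of what svm does internally.

```r
library(e1071)

# Two-class problem: drop setosa so only versicolor and virginica remain
two <- droplevels(subset(iris, Species != "setosa"))

# Fit a plain SVM (no built-in probability model requested)
model <- svm(Species ~ ., data = two)

# Raw SVM decision values for each row
pred  <- predict(model, two, decision.values = TRUE)
score <- as.numeric(attr(pred, "decision.values"))

# Platt-style calibration (simplified): logistic regression of the class on the score
y     <- as.integer(two$Species == "virginica")
platt <- glm(y ~ score, family = binomial())

# Calibrated probability of "virginica" for each row
prob_virginica <- fitted(platt)
head(round(prob_virginica, 3))
```

In practice the calibration would be fit on held-out or cross-validated scores, since scores on the training data tend to be overconfident.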
e1071 is an R package that provides functions for statistical and probabilistic algorithms such as a fuzzy classifier, naive Bayes classifier, bagged clustering, short-time Fourier transform, and support vector machines. Several other R packages also implement SVMs.
To use an SVM in R, there is the e1071 package. The package is not preinstalled, hence one needs to run the line “install.
Setting the probability argument to TRUE for both model fitting and prediction returns, for each prediction, the vector of probabilities of belonging to each class of the response variable. These are stored in a matrix, as an attribute of the prediction object.
For example:
library(e1071)
model <- svm(Species ~ ., data = iris, probability=TRUE)
# (below I'm just predicting to the training dataset - it could of course just
# as easily be a separate test dataset)
pred <- predict(model, iris, probability=TRUE)
head(attr(pred, "probabilities"))
# setosa versicolor virginica
# 1 0.9803339 0.01129740 0.008368729
# 2 0.9729193 0.01807053 0.009010195
# 3 0.9790435 0.01192820 0.009028276
# 4 0.9750030 0.01531171 0.009685342
# 5 0.9795183 0.01164689 0.008834838
# 6 0.9740730 0.01679643 0.009130620
Note, however, that it's important to set probability=TRUE for the call to svm, and not just the call to predict, since the latter alone would produce:
# setosa versicolor virginica
# 1 0.3333333 0.3333333 0.3333333
# 2 0.3333333 0.3333333 0.3333333
# 3 0.3333333 0.3333333 0.3333333
# 4 0.3333333 0.3333333 0.3333333
# 5 0.3333333 0.3333333 0.3333333
# 6 0.3333333 0.3333333 0.3333333
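As the comment in the example notes, prediction could just as easily be on a separate test dataset. The following is a minimal sketch of that case, assuming e1071 is installed; the split size, the seed, and the names test_idx, train, test, top_class and top_prob are arbitrary choices for illustration.

```r
library(e1071)

set.seed(1)                            # reproducible split
test_idx <- sample(nrow(iris), 50)     # hold out 50 rows for testing
train    <- iris[-test_idx, ]
test     <- iris[test_idx, ]

# probability=TRUE is needed when fitting ...
model <- svm(Species ~ ., data = train, probability = TRUE)

# ... and again when predicting
pred  <- predict(model, test, probability = TRUE)
probs <- attr(pred, "probabilities")

# One row per test observation, one column per class
dim(probs)

# Most probable class and its probability for each test row
top_class <- colnames(probs)[max.col(probs, ties.method = "first")]
top_prob  <- apply(probs, 1, max)
head(data.frame(predicted = as.character(pred), top_class, top_prob))
```

The class with the highest probability usually matches the label returned by predict, but the two can occasionally differ, since the label and the probabilities are computed by different procedures.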