
How to Interpret Predict Result of SVM in R?

I'm new to R and I'm using the e1071 package for SVM classification in R.

I used the following code:

    data <- loadNumerical()
    model <- svm(data[,-ncol(data)], data[,ncol(data)], gamma=10)
    print(predict(model, data[c(1:20),-ncol(data)]))

The loadNumerical function loads the data, which have the following form (the first 8 columns are inputs and the last column is the class label):

       [,1] [,2] [,3] [,4] [,5] [,6] [,7]      [,8] [,9]
    1    39    1   -1   43   -1    1    0 0.9050497    0
    2    23   -1   -1   30   -1   -1    0 1.6624974    1
    3    50   -1   -1   49    1    1    2 1.5571429    0
    4    46   -1    1   19   -1   -1    0 1.3523685    0
    5    36    1    1   29   -1    1    1 1.3812029    1
    6    27   -1   -1   19    1    1    0 1.9403649    0
    7    36   -1   -1   25   -1    1    0 2.3360004    0
    8    41    1    1   23    1   -1    1 2.4899738    0
    9    21   -1   -1   18    1   -1    2 1.2989637    1
    10   39   -1    1   21   -1   -1    1 1.6121595    0

The number of rows in the data is 500.

As the code above shows, I ran the prediction on the first 20 rows. The output is:

             1          2          3          4          5          6          7
    0.04906014 0.88230392 0.04910760 0.04910719 0.87302217 0.04898187 0.04909523
             8          9         10         11         12         13         14
    0.04909199 0.87224979 0.04913189 0.04893709 0.87812890 0.04909588 0.04910999
            15         16         17         18         19         20
    0.89837037 0.04903778 0.04914173 0.04897789 0.87572114 0.87001066

Intuitively, a value close to 0 means the row belongs to class 0, and a value close to 1 means it belongs to class 1.

My question is how to interpret the result precisely: is there a threshold s such that values below s are classified as 0 and values above s are classified as 1?

If such an s exists, how can I derive it?

Derrick Zhang asked Oct 16 '11 05:10


1 Answer

Since your outcome variable is numeric, svm() uses the regression formulation of SVM. I think you want the classification formulation. You can switch to it either by coercing your outcome into a factor or by setting type="C-classification".

Regression:

    > model <- svm(vs ~ hp+mpg+gear,data=mtcars)
    > predict(model)
              Mazda RX4       Mazda RX4 Wag          Datsun 710      Hornet 4 Drive
           0.8529506670        0.8529506670        0.9558654451        0.8423224174
      Hornet Sportabout             Valiant          Duster 360           Merc 240D
           0.0747730699        0.6952501964        0.0123405904        0.9966162477
               Merc 230            Merc 280           Merc 280C          Merc 450SE
           0.9494836511        0.7297563543        0.6909235343       -0.0327165348
             Merc 450SL         Merc 450SLC  Cadillac Fleetwood Lincoln Continental
          -0.0092851098       -0.0504982402        0.0319974842        0.0504292348
      Chrysler Imperial            Fiat 128         Honda Civic      Toyota Corolla
          -0.0504750284        0.9769206963        0.9724676874        0.9494910097
          Toyota Corona    Dodge Challenger         AMC Javelin          Camaro Z28
           0.9496260289        0.1349744908        0.1251344111        0.0395243313
       Pontiac Firebird           Fiat X1-9       Porsche 914-2        Lotus Europa
           0.0983094417        1.0041732099        0.4348209129        0.6349628695
         Ford Pantera L        Ferrari Dino       Maserati Bora          Volvo 142E
           0.0009258333        0.0607896408        0.0507385269        0.8664157985

Classification:

    > model <- svm(as.factor(vs) ~ hp+mpg+gear,data=mtcars)
    > predict(model)
              Mazda RX4       Mazda RX4 Wag          Datsun 710      Hornet 4 Drive
                      1                   1                   1                   1
      Hornet Sportabout             Valiant          Duster 360           Merc 240D
                      0                   1                   0                   1
               Merc 230            Merc 280           Merc 280C          Merc 450SE
                      1                   1                   1                   0
             Merc 450SL         Merc 450SLC  Cadillac Fleetwood Lincoln Continental
                      0                   0                   0                   0
      Chrysler Imperial            Fiat 128         Honda Civic      Toyota Corolla
                      0                   1                   1                   1
          Toyota Corona    Dodge Challenger         AMC Javelin          Camaro Z28
                      1                   0                   0                   0
       Pontiac Firebird           Fiat X1-9       Porsche 914-2        Lotus Europa
                      0                   1                   0                   1
         Ford Pantera L        Ferrari Dino       Maserati Bora          Volvo 142E
                      0                   0                   0                   1
    Levels: 0 1
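Applied to the matrix interface used in the question, the same fix might look like the sketch below. The data are synthetic stand-ins, since the question's loadNumerical() is not shown, and the names x, y and pred are illustrative assumptions rather than anything from the original code.

```r
library(e1071)

# Synthetic stand-in for the question's data (assumption):
# 8 numeric inputs and a 0/1 class label.
set.seed(1)
x <- matrix(rnorm(200 * 8), ncol = 8)
y <- as.integer(x[, 1] + x[, 2] > 0)

# Wrapping the label in factor() makes svm() fit a classifier
# (C-classification) instead of a regression.
model <- svm(x, factor(y))

# Predictions are now class labels, so no threshold is needed.
pred <- predict(model, x[1:20, ])
print(pred)
```

Tuning parameters such as gamma can still be passed to svm() as in the question; only the type of the outcome changes.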

Also, if you want probabilities as your prediction rather than just the raw classification, you can do that by fitting with the probability option.

With Probabilities:

    > model <- svm(as.factor(vs) ~ hp+mpg+gear,data=mtcars,probability=TRUE)
    > predict(model,mtcars,probability=TRUE)
              Mazda RX4       Mazda RX4 Wag          Datsun 710      Hornet 4 Drive
                      1                   1                   1                   1
      Hornet Sportabout             Valiant          Duster 360           Merc 240D
                      0                   1                   0                   1
               Merc 230            Merc 280           Merc 280C          Merc 450SE
                      1                   1                   1                   0
             Merc 450SL         Merc 450SLC  Cadillac Fleetwood Lincoln Continental
                      0                   0                   0                   0
      Chrysler Imperial            Fiat 128         Honda Civic      Toyota Corolla
                      0                   1                   1                   1
          Toyota Corona    Dodge Challenger         AMC Javelin          Camaro Z28
                      1                   0                   0                   0
       Pontiac Firebird           Fiat X1-9       Porsche 914-2        Lotus Europa
                      0                   1                   0                   1
         Ford Pantera L        Ferrari Dino       Maserati Bora          Volvo 142E
                      0                   0                   0                   1
    attr(,"probabilities")
                                0          1
    Mazda RX4           0.2393753 0.76062473
    Mazda RX4 Wag       0.2393753 0.76062473
    Datsun 710          0.1750089 0.82499108
    Hornet 4 Drive      0.2370382 0.76296179
    Hornet Sportabout   0.8519490 0.14805103
    Valiant             0.3696019 0.63039810
    Duster 360          0.9236825 0.07631748
    Merc 240D           0.1564898 0.84351021
    Merc 230            0.1780135 0.82198650
    Merc 280            0.3402143 0.65978567
    Merc 280C           0.3829336 0.61706640
    Merc 450SE          0.9110862 0.08891378
    Merc 450SL          0.8979497 0.10205025
    Merc 450SLC         0.9223868 0.07761324
    Cadillac Fleetwood  0.9187301 0.08126994
    Lincoln Continental 0.9153549 0.08464509
    Chrysler Imperial   0.9358186 0.06418140
    Fiat 128            0.1627969 0.83720313
    Honda Civic         0.1649799 0.83502008
    Toyota Corolla      0.1781531 0.82184689
    Toyota Corona       0.1780519 0.82194807
    Dodge Challenger    0.8427087 0.15729129
    AMC Javelin         0.8496198 0.15038021
    Camaro Z28          0.9190294 0.08097056
    Pontiac Firebird    0.8361349 0.16386511
    Fiat X1-9           0.1490934 0.85090660
    Porsche 914-2       0.5797194 0.42028060
    Lotus Europa        0.4169587 0.58304133
    Ford Pantera L      0.8731716 0.12682843
    Ferrari Dino        0.8392372 0.16076281
    Maserati Bora       0.8519422 0.14805785
    Volvo 142E          0.2289231 0.77107694
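If an explicit threshold is still wanted, one option is to pull the probability matrix out of the prediction and apply a cutoff to the class-1 column. This is a minimal sketch: the 0.5 cutoff and the names probs, p1 and custom are illustrative assumptions, not a recommended setting.

```r
library(e1071)

model <- svm(as.factor(vs) ~ hp + mpg + gear, data = mtcars, probability = TRUE)
pred  <- predict(model, mtcars, probability = TRUE)

# The class probabilities are stored as an attribute of the returned factor.
probs <- attr(pred, "probabilities")

# Probability of class "1" for each car, selected by column name because
# the column order is not guaranteed to be 0 then 1.
p1 <- probs[, "1"]

# Apply a hypothetical cutoff; 0.5 is only an example value.
custom <- ifelse(p1 > 0.5, "1", "0")
head(data.frame(p1 = p1, custom = custom))
```

Labels obtained from a probability cutoff may occasionally differ from the labels predict() returns, since the probabilities come from a separate model fitted on top of the SVM decision values.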

Ian Fellows answered Oct 02 '22 18:10