
Vowpal Wabbit Logistic Regression

I am performing logistic regression using Vowpal Wabbit on a dataset with 25 features and 48 million instances. I have a question about the values in the "current predict" column: should they be between 0 and 1?

average    since         example     example  current  current  current
loss       last          counter      weight    label  predict features
0.693147   0.693147            1         1.0  -1.0000   0.0000       24
0.419189   0.145231            2         2.0  -1.0000  -1.8559       24
0.235457   0.051725            4         4.0  -1.0000  -2.7588       23
6.371911   12.508365           8         8.0  -1.0000  -3.7784       24
3.485084   0.598258           16        16.0  -1.0000  -2.2767       24
1.765249   0.045413           32        32.0  -1.0000  -2.8924       24
1.017911   0.270573           64        64.0  -1.0000  -3.0438       25
0.611419   0.204927          128       128.0  -1.0000  -3.1539       25
0.469127   0.326834          256       256.0  -1.0000  -1.6101       23
0.403473   0.337820          512       512.0  -1.0000  -2.8843       25
0.337348   0.271222         1024      1024.0  -1.0000  -2.5209       25
0.328909   0.320471         2048      2048.0  -1.0000  -2.0732       25
0.309401   0.289892         4096      4096.0  -1.0000  -2.7639       25
0.291447   0.273492         8192      8192.0  -1.0000  -2.5978       24
0.287428   0.283409        16384     16384.0  -1.0000  -3.1774       25
0.287249   0.287071        32768     32768.0  -1.0000  -2.7770       24
0.282737   0.278224        65536     65536.0  -1.0000  -1.9070       25
0.278517   0.274297       131072    131072.0  -1.0000  -3.3813       24
0.291475   0.304433       262144    262144.0   1.0000  -2.7975       23
0.324553   0.357630       524288    524288.0  -1.0000  -0.8995       24
0.373086   0.421619      1048576   1048576.0  -1.0000  -1.2076       24
0.422605   0.472125      2097152   2097152.0   1.0000  -1.4907       25
0.476046   0.529488      4194304   4194304.0  -1.0000  -1.8591       25
0.476627   0.477208      8388608   8388608.0  -1.0000  -2.0037       23
0.446556   0.416485     16777216  16777216.0  -1.0000  -0.9915       24
0.422831   0.399107     33554432  33554432.0  -1.0000  -1.9549       25
0.428316   0.433801     67108864  67108864.0  -1.0000  -0.6376       24
0.425511   0.422705    134217728 134217728.0  -1.0000  -0.4094       24
0.425185   0.424860    268435456 268435456.0  -1.0000  -1.1529       24
0.426747   0.428309    536870912 536870912.0  -1.0000  -2.7468       25
user1586694 asked Nov 09 '14 22:11


1 Answer

By default, predictions are raw scores in the range [-50, +50] (theoretically any real number, but Vowpal Wabbit truncates them to [-50, +50]), so they do not have to lie between 0 and 1.

To convert them to {-1, +1}, use --binary. Positive predictions are simply mapped to +1, negative to -1.

To convert them to [0, +1], use --link=logistic. This uses the logistic function 1/(1 + exp(-x)). You should also use --loss_function=logistic if you want to interpret the numbers as probabilities.

To convert them to [-1, +1], use --link=glf1. This uses the formula 2/(1 + exp(-x)) - 1 (a generalized logistic function with limits of -1 and +1).
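To see what these mappings do to a raw prediction, here is a minimal Python sketch of the formulas above. It is not Vowpal Wabbit's own code: the function names are made up for illustration, and mapping a raw score of exactly 0 to -1 in `to_binary` is an assumption, since only the positive and negative cases are described above.

```python
import math

def clip(raw, limit=50.0):
    """Truncate a raw score to [-limit, +limit], like the [-50, +50] range above."""
    return max(-limit, min(limit, raw))

def to_binary(raw):
    """Map a raw score to {-1, +1} (treating exactly 0 as -1 is an assumption)."""
    return 1 if raw > 0 else -1

def to_logistic(raw):
    """Logistic function 1/(1 + exp(-x)); result lies in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-clip(raw)))

def to_glf1(raw):
    """Generalized logistic function 2/(1 + exp(-x)) - 1; result lies in (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-clip(raw))) - 1.0

if __name__ == "__main__":
    # -2.7468 is the last "current predict" value in the question's output.
    for raw in (-2.7468, 0.0, 1.5):
        print(raw, to_binary(raw), to_logistic(raw), to_glf1(raw))
```

A negative raw prediction such as -2.7468 therefore corresponds to a probability below 0.5 under the logistic link, which is consistent with the -1 label on that example.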

Martin Popel answered Oct 19 '22 03:10
