
Ordinary Least Squares Regression in Vowpal Wabbit

Has anyone managed to run an ordinary least squares regression in Vowpal Wabbit? I'm trying to confirm that it returns the same answer as the exact solution, i.e. when choosing a to minimize ||y - X a||_2^2 + ||R a||_2^2 (where R is the regularization matrix), I want to get the analytic answer a = (X^T X + R^T R)^(-1) X^T y. Doing this type of regression takes about 5 lines of Python with NumPy.
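For reference, the analytic answer can be sketched in NumPy roughly as follows. The function name, the synthetic data and the choice of R are illustrative assumptions, not part of the original setup.

```python
import numpy as np

def ridge_closed_form(X, y, R):
    """Return a = (X^T X + R^T R)^(-1) X^T y."""
    A = X.T @ X + R.T @ R
    b = X.T @ y
    # Solve the linear system instead of forming the inverse explicitly.
    return np.linalg.solve(A, b)

# Synthetic data (assumed for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_a = np.array([3.4, -1.2, 4.0])
y = X @ true_a + 0.01 * rng.normal(size=200)

# R = 0 gives plain ordinary least squares.
a_ols = ridge_closed_form(X, y, np.zeros((3, 3)))

# A scaled identity for R gives a ridge-style penalty.
a_ridge = ridge_closed_form(X, y, 0.5 * np.eye(3))

print("OLS:  ", a_ols)
print("ridge:", a_ridge)
```

With R set to zero this should agree with np.linalg.lstsq(X, y, rcond=None)[0] up to floating-point error.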

The VW documentation suggests that it can do this (presumably with the "squared" loss function), but so far I've been unable to get it to come even close to matching the Python results. Because squared is the default loss function, I'm simply calling:

$ vw-varinfo input.txt

where input.txt has lines like

1.4 | 0:3.4 1:-1.2 2:4.0  .... etc
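A minimal sketch of producing lines in this shape from NumPy arrays, so that VW and the NumPy solution see the same data. The helper name to_vw_lines and the two sample rows are hypothetical.

```python
import numpy as np

def to_vw_lines(X, y):
    """Yield one 'label | index:value ...' line per row of X."""
    for label, row in zip(y, X):
        # float() keeps the text form plain (e.g. 3.4, not a NumPy repr).
        feats = " ".join(f"{j}:{float(v)}" for j, v in enumerate(row))
        yield f"{float(label)} | {feats}"

# Two made-up rows, just to show the output format.
X = np.array([[3.4, -1.2, 4.0],
              [0.5, 2.0, -1.0]])
y = np.array([1.4, -0.3])

lines = list(to_vw_lines(X, y))
for line in lines:
    print(line)
```

Writing the result to input.txt is then a matter of joining the lines with newlines and saving them to a file.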

Do I need some other parameters in the VW call? I'm unable to grok the (rather minimal) documentation.

asked Oct 04 '13 17:10 by andyInCambridge


1 Answer

I think you should use this syntax (Vowpal Wabbit version 7.3.1):

vw -d input.txt -f linear_model -c --passes 50 --holdout_off --loss_function squared --invert_hash model_readable.txt

This command instructs VW to read your input.txt file, write a model file (linear_model) and a cache to disk (the cache is necessary for multi-pass convergence), and fit a regression using the squared loss function. It also writes the model coefficients in human-readable form to a file called model_readable.txt.

The --holdout_off option is a recent addition that suppresses the automatic out-of-sample loss computation (if you are using an earlier version, you have to remove it).

Basically, a regression fitted by stochastic gradient descent will give you a vector of coefficients similar to the exact solution only when no regularization is applied and the number of passes is high. I would suggest 50 or even more; randomly shuffling the rows of the input file also helps the algorithm converge better.
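To illustrate the point about passes and shuffling, here is a small sketch of plain stochastic gradient descent on the squared loss, compared against the exact least squares solution. This is not VW's actual update rule (VW's default updates are more elaborate); the function name, learning rate and synthetic data are assumptions chosen for the demonstration.

```python
import numpy as np

def sgd_least_squares(X, y, passes, lr=0.002, seed=0):
    """Plain SGD on 0.5 * (x.a - y)^2, reshuffling the rows on every pass."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    a = np.zeros(d)
    for _ in range(passes):
        for i in rng.permutation(n):   # random row order for this pass
            err = X[i] @ a - y[i]      # prediction error on one example
            a -= lr * err * X[i]       # gradient step for the squared loss
    return a

# Synthetic data (assumed for illustration only).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = X @ np.array([3.4, -1.2, 4.0]) + 0.01 * rng.normal(size=200)

a_exact = np.linalg.lstsq(X, y, rcond=None)[0]
a_1 = sgd_least_squares(X, y, passes=1)
a_50 = sgd_least_squares(X, y, passes=50)

err_1 = np.linalg.norm(a_1 - a_exact)
err_50 = np.linalg.norm(a_50 - a_exact)
print("distance to exact solution after 1 pass:  ", err_1)
print("distance to exact solution after 50 passes:", err_50)
```

With this small learning rate a single pass stops well short of the exact coefficients, while 50 shuffled passes end up much closer.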

answered Dec 04 '22 20:12 by Luca Massaron