I'm trying to save a Vowpal Wabbit model with inverted hashes. I have a valid model produced with the following command:
vw --oaa 2 -b 24 -d mydata.vw --readable_model mymodel.readable
which produces a model file like this:
Version 7.7.0
Min label:-1.000000
Max label:1.000000
bits:24
0 pairs:
0 triples:
rank:0
lda:0
0 ngram:
0 skip:
options: --oaa 2
:0
66:0.016244
67:-0.016241
80:0.026017
81:-0.026020
84:0.015005
85:-0.015007
104:-0.053924
105:0.053905
112:-0.015402
113:0.015412
122:-0.025704
123:0.025704
...
(and so on for many thousands more features). However, for the model to be more useful, I need to see the feature names. This seemed like a fairly obvious thing to do, so I ran
vw --oaa 2 -b 24 -d mydata.vw --invert_hash mymodel.inverted
and it produced a model file like this, with no weights at all:
Version 7.7.0
Min label:-1.000000
Max label:1.000000
bits:24
0 pairs:
0 triples:
rank:0
lda:0
0 ngram:
0 skip:
options: --oaa 2
:0
It feels like I've obviously done something wrong, but I think I'm using the options in the documented way:
--invert_hash is similar to --readable_model, but the model is output in a more human readable format with feature names followed by weights, instead of hash indexes and weights.
Does anyone see why my second command is failing to produce any output?
This is caused by a bug in VW which was fixed recently (on account of this question), see https://github.com/JohnLangford/vowpal_wabbit/issues/337.
By the way, it does not make sense to use --oaa 2. If you want binary classification (e.g. logistic regression), use --loss_function=logistic (and make sure your labels are 1 and -1).
OAA makes sense only for more than two classes (N > 2), and it is recommended to use --loss_function=logistic with --oaa.
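If the existing data file carries OAA-style labels (1 and 2), the labels have to be rewritten before switching to the binary setup. A minimal sketch of that rewrite follows. It assumes the label is the first space-separated token on each line, with no importance weight or tag attached; the function name `oaa_to_binary_label`, the sample lines, and the choice to map class 1 to 1 and everything else to -1 are all illustrative.

```python
def oaa_to_binary_label(line, positive="1"):
    """Rewrite the leading label of one VW example line to 1 or -1."""
    label, sep, rest = line.partition(" ")
    # Lines without a label (unlabeled examples start with '|') are left untouched.
    if not sep or label.startswith("|"):
        return line
    new_label = "1" if label == positive else "-1"
    return new_label + sep + rest


# Hypothetical examples in VW text format.
examples = [
    "1 | price:0.23 sqft:0.25",
    "2 | price:0.18 sqft:0.15",
]
converted = [oaa_to_binary_label(e) for e in examples]
print(converted)
```

Which class becomes the positive one is arbitrary; pass a different `positive` value to flip it.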
Also note that learning with --invert_hash is much slower (and requires more memory, of course). The recommended way to create an inverted-hash model, especially with multiple passes, is to learn a usual binary model and then convert it to inverted hash using one pass over the training data with -t:
vw -d mytrain.data -c --passes 4 --oaa 3 -f model.binary
vw -d mytrain.data -t -i model.binary --invert_hash model.humanreadable
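Once weights have been written out, a file in the --readable_model layout quoted in the question can be loaded for inspection. The sketch below assumes exactly that layout: header lines, then a line containing only `:0`, then one `index:weight` pair per line. The function name `parse_readable_model` and the shortened sample text are illustrative, and the code does not attempt to parse --invert_hash output, whose line layout is not shown here.

```python
def parse_readable_model(text):
    """Return {hash_index: weight} from the text of a --readable_model file."""
    weights = {}
    in_weights = False
    for raw in text.splitlines():
        line = raw.strip()
        if not in_weights:
            # Assumption: the header ends with a line that holds only ":0".
            in_weights = (line == ":0")
            continue
        index, sep, value = line.partition(":")
        if not sep or not index.isdigit():
            continue  # skip blank or unexpected lines
        weights[int(index)] = float(value)
    return weights


# Shortened sample based on the listing in the question.
sample = """Version 7.7.0
bits:24
options: --oaa 2
:0
66:0.016244
67:-0.016241
104:-0.053924
"""

weights = parse_readable_model(sample)
# Largest absolute weights first.
top = sorted(weights.items(), key=lambda kv: abs(kv[1]), reverse=True)
print(top[:2])
```

An empty result from this parser is a quick way to detect the symptom described in the question, where the file contains the header but no weights.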