Interpreting basic output from Vowpal Wabbit

Tags:

vowpalwabbit

I have a couple of questions about the output from a simple run of VW. I have read around the internet and the wiki but am still unsure about a couple of basic things.

I ran the following on the Boston housing data:

vw -d housing.vm --progress 1

where the housing.vm file is set up as (partially):

[image: excerpt of housing.vm]

and output is (partially):

[image: excerpt of the vw output]

Questions:

1) Is it correct to think of the average loss column as the result of the following steps?

a) Predict zero, so the first average loss is the squared error of the first example (with the prediction as zero).

b) Build a model on example 1 and predict example 2. Average the two squared losses so far.

c) Build a model on examples 1-2 and predict example 3. Average the three squared losses so far.

d) ...

Do this until you hit the end of the data (assuming a single pass).

2) What is the current features column? It appears to be the number of non-zero features plus an intercept. The example suggests that a feature is not counted if its value is zero. Is that true? For instance, the second record has a value of zero for 'ZN'. Does VW really treat that numeric feature as missing?

B_Miner asked Sep 14 '14 22:09


1 Answer

Your statements are basically correct. By default, VW does online learning, so in step c it takes the current model (weights) and updates it with the current example (rather than learning from all the previous examples again).
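To make the average loss column concrete, here is a minimal Python sketch of that predict-then-update loop. It assumes plain gradient descent on squared loss with a fixed learning rate; VW's default update rule is more elaborate, so the numbers will not match real vw output. The function name and the toy data are made up for illustration.

```python
def progressive_average_loss(examples, learning_rate=0.1):
    """Return the running average squared loss after each example.

    examples: list of (label, {feature_name: value}) pairs.
    Hypothetical helper, not part of VW.
    """
    weights = {}          # all weights start at zero
    total_loss = 0.0
    averages = []
    for seen, (label, features) in enumerate(examples, start=1):
        # 1. Predict with the current weights, before learning from this example.
        prediction = sum(weights.get(name, 0.0) * value
                         for name, value in features.items())
        total_loss += (prediction - label) ** 2
        averages.append(total_loss / seen)
        # 2. Update the existing weights with this one example only
        #    (the factor 2 of the gradient is folded into learning_rate).
        error = prediction - label
        for name, value in features.items():
            weights[name] = weights.get(name, 0.0) - learning_rate * error * value
    return averages


# Toy data, not the housing file.
toy = [
    (1.0, {"Constant": 1.0, "x": 1.0}),
    (2.0, {"Constant": 1.0, "x": 2.0}),
]
averages = progressive_average_loss(toy)
print(averages)
```

The first entry is always the squared label, because the first prediction is zero. Each later entry measures the loss on an example the model had not yet learned from.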

As you supposed, the current features column is the number of (non-zero) features for the current example. The intercept feature is included automatically, unless you specify --noconstant.
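As a rough illustration of that count, the sketch below tallies the non-zero features on one line of VW-style input and adds one for the intercept. The helper count_current_features and the sample line are hypothetical, and the parsing covers only the simple "label | name:value ..." form, not the full input format.

```python
def count_current_features(line, constant=True):
    """Count non-zero features on one simplified VW-format line.

    Hypothetical helper: handles 'label |namespace name:value ...' only.
    """
    parts = line.split("|")
    count = 0
    for namespace in parts[1:]:
        tokens = namespace.split()
        # A token glued directly to the bar is a namespace name, not a feature.
        if namespace and not namespace[0].isspace() and tokens:
            tokens = tokens[1:]
        for token in tokens:
            _name, _, raw = token.partition(":")
            value = float(raw) if raw else 1.0   # a bare name means value 1
            if value != 0.0:
                count += 1
    # The intercept is added unless it is switched off (as with --noconstant).
    return count + (1 if constant else 0)


# Illustrative line, not copied from the original housing.vm.
line = "21.6 | CRIM:0.02731 ZN:0 INDUS:7.07"
print(count_current_features(line))                  # 2 non-zero features + intercept
print(count_current_features(line, constant=False))  # the 2 non-zero features only
```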

There is no difference between a missing feature and a feature with zero value. Both mean that you won't update the corresponding weight.
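The reason is that a gradient step for a linear model is proportional to the feature value, so a value of zero gives a zero update. The sketch below shows this with a plain squared-loss gradient step, which is an assumption and a simplification of what VW actually does; sgd_step and the weights are invented for the example.

```python
def sgd_step(weights, features, label, learning_rate=0.1):
    """One plain gradient step on squared loss (hypothetical helper)."""
    prediction = sum(weights.get(name, 0.0) * value
                     for name, value in features.items())
    error = prediction - label
    updated = dict(weights)
    for name, value in features.items():
        # The step is proportional to the feature value, so value == 0 changes nothing.
        updated[name] = updated.get(name, 0.0) - learning_rate * error * value
    return updated


start = {"Constant": 0.5, "ZN": 0.3, "CRIM": -0.2}

# Same example, once with ZN given as zero and once with ZN left out.
with_zero = sgd_step(start, {"Constant": 1.0, "ZN": 0.0, "CRIM": 2.0}, label=1.0)
without = sgd_step(start, {"Constant": 1.0, "CRIM": 2.0}, label=1.0)

print(with_zero["ZN"], without["ZN"])  # both still equal start["ZN"]
```

In both calls the ZN weight is left untouched, while the Constant and CRIM weights move by the same amount.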

Martin Popel answered Oct 29 '22 17:10