
How to get predictions and calculate accuracy for a given test set in fastai?


I'm trying to load a learner that was exported with learn.export() and run it against a test set. I want the test set to have labels so that I can measure its accuracy.

This is my code:

test_src = (TextList.from_df(df, path, cols='texts')
            .split_by_rand_pct(0.1, seed=42)
            .label_from_df(cols='recommend'))

learn_fwd = load_learner(path + '/fwd_learn_c', 
                         test=test_src) #, tfm_y=False)


pred_fwd,lbl_fwd = learn_fwd.get_preds(ds_type=DatasetType.Test,ordered=True) 
accuracy(pred_fwd, lbl_fwd)

I got the following error; apparently load_learner doesn't accept a labeled data set as the test set:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-22-7f52f2136d8e> in <module>
      6 
      7 learn_fwd = load_learner(path + '/fwd_learn_c', 
----> 8                          test=test_src) #, tfm_y=False)
      9 learn_bwd = load_learner(path + '/bwd_learn_c',
     10                          test=test_src) #, tfm_y=test_src)

~/miniconda3/lib/python3.7/site-packages/fastai/basic_train.py in load_learner(path, file, test, tfm_y, **db_kwargs)
    622     model = state.pop('model')
    623     src = LabelLists.load_state(path, state.pop('data'))
--> 624     if test is not None: src.add_test(test, tfm_y=tfm_y)
    625     data = src.databunch(**db_kwargs)
    626     cb_state = state.pop('cb_state')

~/miniconda3/lib/python3.7/site-packages/fastai/data_block.py in add_test(self, items, label, tfms, tfm_y)
    562         "Add test set containing `items` with an arbitrary `label`."
    563         # if no label passed, use label of first training item
--> 564         if label is None: labels = EmptyLabelList([0] * len(items))
    565         else: labels = self.valid.y.new([label] * len(items)).process()
    566         if isinstance(items, MixedItemList): items = self.valid.x.new(items.item_lists, inner_df=items.inner_df).process()

TypeError: object of type 'LabelLists' has no len()
Ahmad asked Jul 13 '20 07:07



1 Answer

It seems that the test argument only accepts an ItemList (without labels). In the example above I passed a LabelLists object, which has no len() and is the source of the error. To get the accuracy for a labeled test set, I found the following solution:

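from fastai.text import *  # fastai v1: TextList, DatasetType, accuracy

# Create your labeled test set.
# Note: this builds a fresh vocab from df; if the learner needs its own vocab,
# passing vocab=learn.data.vocab to TextList.from_df may be required (untested assumption).
data_test = (TextList.from_df(df, path, cols='texts')
             .split_by_rand_pct(0.1, seed=42)
             .label_from_df(cols='recommend'))

# Use the larger (90%) split as the validation set, then build a DataBunch
data_test.valid = data_test.train
data_test = data_test.databunch()

# Replace the learner's validation DataLoader with the test data you created
learn.data.valid_dl = data_test.valid_dl

# Now y refers to the actual labels in the data set
preds, y = learn.get_preds(ds_type=DatasetType.Valid)
acc = accuracy(preds, y)

# Alternatively, call validate() if you don't need the predictions.
# It returns [loss, *metrics], so index 1 is accuracy only if that is the learner's first metric.
acc = learn.validate()[1]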
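
For reference, the accuracy computed above amounts to picking the highest-scoring class for each row and comparing it with the true label. Below is a minimal plain-Python sketch of that idea, assuming preds is a list of per-class scores and the targets are integer class indices. The name simple_accuracy and the example values are hypothetical and not part of fastai.

```python
def simple_accuracy(preds, targets):
    """Return the fraction of rows whose highest-scoring class equals the target."""
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    if not preds:
        raise ValueError("cannot compute accuracy of an empty set")
    correct = 0
    for scores, target in zip(preds, targets):
        # Index of the largest score (the first one wins on ties)
        predicted = max(range(len(scores)), key=scores.__getitem__)
        if predicted == target:
            correct += 1
    return correct / len(preds)


# Hypothetical two-class scores for four test items and their true labels
example_preds = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
example_targets = [0, 1, 1, 1]
print(simple_accuracy(example_preds, example_targets))
```

With real predictions, the per-class scores would come from get_preds and the targets from the labeled data set, so the result is only meaningful when the labels were encoded with the same class order the learner was trained with.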
Ahmad answered Sep 30 '22 17:09
