Understanding spaCy's Scorer Output

I'm evaluating a custom NER model that I built with spaCy, scoring each training set with spaCy's Scorer class.

    import spacy
    from spacy.gold import GoldParse
    from spacy.scorer import Scorer

    def Eval(examples):
        # test the saved model
        print("Loading from", './model6/')
        ner_model = spacy.load('./model6/')

        scorer = Scorer()
        try:
            for input_, annot in examples:
                # gold-standard doc built from the raw text and its entity annotations
                doc_gold_text = ner_model.make_doc(input_)
                gold = GoldParse(doc_gold_text, entities=annot['entities'])
                # score the model's predictions against the gold parse
                pred_value = ner_model(input_)
                scorer.score(pred_value, gold)
        except Exception as e:
            print(e)

        print(scorer.scores)
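
For context, examples here is a list of (text, annotations) pairs in the usual spaCy v2 training format, where each entity is a (start_char, end_char, label) character-offset tuple. The sentences and labels below are just illustrative placeholders:

    # Hypothetical sample of the structure passed to Eval()
    examples = [
        ("Apple is looking at buying a U.K. startup",
         {"entities": [(0, 5, "ORG"), (29, 33, "GPE")]}),
        ("San Francisco considers banning sidewalk delivery robots",
         {"entities": [(0, 13, "GPE")]}),
    ]

    Eval(examples)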

It works fine, but I don't understand the output. Here's what I get for each training set:

    {'uas': 0.0, 'las': 0.0, 'ents_p': 90.14084507042254, 'ents_r': 92.7536231884058, 'ents_f': 91.42857142857143, 'tags_acc': 0.0, 'token_acc': 100.0}
    {'uas': 0.0, 'las': 0.0, 'ents_p': 91.12227805695142, 'ents_r': 93.47079037800687, 'ents_f': 92.28159457167091, 'tags_acc': 0.0, 'token_acc': 100.0}
    {'uas': 0.0, 'las': 0.0, 'ents_p': 92.45614035087719, 'ents_r': 92.9453262786596, 'ents_f': 92.70008795074759, 'tags_acc': 0.0, 'token_acc': 100.0}
    {'uas': 0.0, 'las': 0.0, 'ents_p': 94.5993031358885, 'ents_r': 94.93006993006993, 'ents_f': 94.76439790575917, 'tags_acc': 0.0, 'token_acc': 100.0}
    {'uas': 0.0, 'las': 0.0, 'ents_p': 92.07920792079209, 'ents_r': 93.15525876460768, 'ents_f': 92.61410788381743, 'tags_acc': 0.0, 'token_acc': 100.0}

Does anyone know what the keys are? I've looked over spaCy's documentation and could not find anything.

Thanks!

asked Jun 01 '18 by Evan Lalo


1 Answer

  • UAS (Unlabelled Attachment Score) and LAS (Labelled Attachment Score) are standard metrics for evaluating dependency parsing. UAS is the proportion of tokens whose head has been correctly assigned; LAS is the proportion of tokens whose head has been correctly assigned with the right dependency label (subject, object, etc.).
  • ents_p, ents_r, ents_f are the precision, recall and F-score for the NER task (see the worked example after this list).
  • tags_acc is the POS tagging accuracy.
  • token_acc seems to be the precision for token segmentation.
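
To make the NER numbers concrete, here is a minimal sketch of how precision, recall and F-score relate to entity counts. The counts below are hypothetical, chosen so that they happen to reproduce the first result in the question (any multiple of them gives the same ratios); spaCy simply reports the values as percentages.

    # Hypothetical entity counts for one evaluation run (not spaCy internals)
    true_positives = 64    # predicted entities that exactly match a gold entity
    false_positives = 7    # predicted entities with no matching gold entity
    false_negatives = 5    # gold entities the model missed

    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    f_score = 2 * precision * recall / (precision + recall)

    # spaCy reports these as percentages, like ents_p, ents_r and ents_f
    print(100 * precision, 100 * recall, 100 * f_score)
    # roughly 90.14, 92.75 and 91.43 -- the first set of scores above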

answered Sep 27 '22 by mcoav