
How to train NER and integrate it into the original model using spaCy

I am trying to train NER on my own data using spaCy. My question is how to integrate the trained NER into the original model, so that I can conveniently keep training it and use it in my application. I did not find any sample that does this.

I found the examples below for training NER, but none of them seems to save the trained model and integrate it back into spaCy. Some keep the model in memory only, and some save the NER model to a separate folder. What is the appropriate way to do this? Thank you!

I am using spacy 1.7.3

https://github.com/explosion/spaCy/blob/master/examples/training/train_ner.py
https://github.com/explosion/spacy-dev-resources/blob/master/spacy-annotator/displacy/parse.py

harryIT asked Apr 13 '17 09:04




1 Answer

To provide training examples to the entity recogniser, you'll first need to create an instance of the GoldParse class. You can specify your annotations in a stand-off format or as token tags.

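import spacy
import random
from spacy.gold import GoldParse
from spacy.language import EntityRecognizer

# Stand-off annotations: (text, [(start_char, end_char, label), ...])
train_data = [
    ('Who is Chaka Khan?', [(7, 17, 'PERSON')]),
    ('I like London and Berlin.', [(7, 13, 'LOC'), (18, 24, 'LOC')])
]

# Load the English model without its entity recogniser and parser,
# then create a fresh entity recogniser for the labels to be learned
nlp = spacy.load('en', entity=False, parser=False)
ner = EntityRecognizer(nlp.vocab, entity_types=['PERSON', 'LOC'])

# Run 5 passes over the training data, shuffling the examples each time
for itn in range(5):
    random.shuffle(train_data)
    for raw_text, entity_offsets in train_data:
        # Tokenize only, then attach the gold-standard entities
        doc = nlp.make_doc(raw_text)
        gold = GoldParse(doc, entities=entity_offsets)

        # Tag the doc, then update the recogniser on this example
        nlp.tagger(doc)
        ner.update(doc, gold)
# Finalize the model once all updates are done
ner.model.end_training()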
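Before training, it can help to confirm that the stand-off offsets in train_data select the text you intended, since an off-by-one offset will not line up with the entity. The following is a minimal sketch in plain Python (no spaCy required); check_offsets is a hypothetical helper name, not part of spaCy.

```python
def check_offsets(examples):
    """Return (substring, label) for every stand-off annotation.

    Hypothetical helper, not a spaCy function: it only slices the raw
    text with the given character offsets so they can be inspected.
    """
    spans = []
    for text, annotations in examples:
        for start, end, label in annotations:
            # Offsets are character positions; end is exclusive
            if not 0 <= start < end <= len(text):
                raise ValueError(
                    "bad offsets (%d, %d) for text %r" % (start, end, text)
                )
            spans.append((text[start:end], label))
    return spans


train_data = [
    ('Who is Chaka Khan?', [(7, 17, 'PERSON')]),
    ('I like London and Berlin.', [(7, 13, 'LOC'), (18, 24, 'LOC')])
]

spans = check_offsets(train_data)
for substring, label in spans:
    print(label, repr(substring))
```

If a printed substring is cut off or includes extra characters such as punctuation, the corresponding offsets need adjusting before the example is used for training.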

For a simpler example, you can annotate the tokens directly with tags instead of character offsets:

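from spacy.tokens import Doc

# One tag per token: 'U-ANIMAL' marks a single-token ANIMAL entity, 'O' marks no entity
doc = Doc(nlp.vocab, words=[u'rats', u'make', u'good', u'pets'])
gold = GoldParse(doc, entities=[u'U-ANIMAL', u'O', u'O', u'O'])
ner = EntityRecognizer(nlp.vocab, entity_types=['ANIMAL'])
ner.update(doc, gold)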
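To see how the two annotation formats relate, here is a rough sketch that turns character offsets into per-token tags of the kind used above (B- for the first token of an entity, I- for inner tokens, L- for the last, U- for a single-token entity, O for none). It is plain Python: the regular-expression tokenizer is only an assumed stand-in for spaCy's tokenizer, which may split text differently, and offsets_to_biluo is a hypothetical helper name.

```python
import re


def offsets_to_biluo(text, entities):
    """Convert (start, end, label) character offsets to per-token tags.

    Hypothetical helper. Tokens are words or single punctuation marks,
    which only approximates what spaCy's tokenizer produces.
    """
    tokens = [(m.group(), m.start(), m.end())
              for m in re.finditer(r"\w+|[^\w\s]", text)]
    tags = ["O"] * len(tokens)
    for start, end, label in entities:
        # Indices of the tokens lying entirely inside the entity span
        inside = [i for i, (_, s, e) in enumerate(tokens)
                  if s >= start and e <= end]
        if not inside:
            continue
        if len(inside) == 1:
            tags[inside[0]] = "U-" + label
        else:
            tags[inside[0]] = "B-" + label
            for i in inside[1:-1]:
                tags[i] = "I-" + label
            tags[inside[-1]] = "L-" + label
    return [word for word, _, _ in tokens], tags


words, tags = offsets_to_biluo('Who is Chaka Khan?', [(7, 17, 'PERSON')])
for word, tag in zip(words, tags):
    print(word, tag)
```

In this sketch the two-token name gets a B- and an L- tag, whereas a one-token entity such as 'London' would get a single U- tag, matching the 'U-ANIMAL' tag in the snippet above.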

https://spacy.io/docs/usage/training-ner

Nishank Mahore answered Nov 15 '22 08:11