
spaCy 2.0: Save and Load a Custom NER model

Tags: python, nlp, spacy

I've trained a custom NER model in spaCy with a custom tokenizer. I'd like to save the NER model without the tokenizer. I tried the following code, which I found in the spaCy support forum:

import spacy

nlp = spacy.load("en")
nlp.tokenizer = some_custom_tokenizer
# Train the NER model...
nlp.tokenizer = None
nlp.to_disk('/tmp/model', disable=['tokenizer'])

When I try to load it, the pipeline is empty and, surprisingly, it has the default spaCy tokenizer.

nlp = spacy.blank('en').from_disk('/tmp/model', disable=['tokenizer'])

Any idea how I can load the model without the tokenizer but still get the full pipeline? Thanks.

Gino asked Jan 30 '18 15:01


1 Answer

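You can load your model with nlp = spacy.load('/tmp/model') after you have saved it to disk. According to the spaCy documentation (https://spacy.io/usage/training#section-saving-loading), calling from_disk on a blank Language object, as you did, apparently only loads the binary data: spacy.blank('en') starts with an empty pipeline, so there are no components for from_disk to load data into. spacy.load, by contrast, reads the model's meta.json and creates the listed pipeline components before loading their data.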
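To check what spacy.load would try to rebuild, it can help to look at the meta.json that to_disk writes into the model directory. The following is only a minimal sketch using the standard library. The helper name read_model_meta is hypothetical (it is not part of spaCy), and it assumes the saved directory contains a meta.json with "lang" and "pipeline" keys.

```python
import json
from pathlib import Path


def read_model_meta(model_dir):
    """Return (lang, pipeline) from a saved model's meta.json.

    Hypothetical helper for inspection only; not a spaCy API.
    Assumes meta.json sits directly inside model_dir.
    """
    meta_path = Path(model_dir) / "meta.json"
    with meta_path.open(encoding="utf8") as f:
        meta = json.load(f)
    # "lang" names the language class and "pipeline" lists the
    # component names stored alongside the binary data.
    return meta.get("lang"), meta.get("pipeline", [])
```

Calling read_model_meta('/tmp/model') on the directory from the question should return the language code and the list of component names. If 'ner' appears in that list, the component was saved, and the empty pipeline comes from how the model was loaded rather than from how it was written.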

Valentin Calomme answered Nov 03 '22 16:11