I've trained a custom NER model in spaCy with a custom tokenizer. I'd like to save the NER model without the tokenizer. I tried the following code, which I found in the spaCy support forum:
import spacy
nlp = spacy.load("en")
nlp.tokenizer = some_custom_tokenizer
# Train the NER model...
nlp.tokenizer = None
nlp.to_disk('/tmp/my_model', disable=['tokenizer'])
When I try to load it, the pipeline is empty and, surprisingly, it has the default spaCy tokenizer.
nlp = spacy.blank('en').from_disk('/tmp/model', disable=['tokenizer'])
Any idea how I can load the model without the tokenizer, but still get the full pipeline? Thanks.
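You can use nlp = spacy.load('/tmp/model')
to load your model after you have saved it to disk. According to the spaCy documentation (https://spacy.io/usage/training#section-saving-loading), what you did only loads the binary data into the existing object. spacy.blank('en') has no pipeline components, so from_disk has nothing to load the NER data into and the pipeline stays empty. spacy.load reads the saved metadata first, rebuilds the pipeline from it, and only then loads the binary data.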
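To make the difference concrete, here is a minimal sketch, assuming spaCy v3 (the question uses the older v2 API, so names such as nlp.initialize and Example differ there). WhitespaceTokenizer and the one-sentence training example are hypothetical stand-ins for the custom tokenizer and the trained NER model. Instead of excluding the tokenizer when saving, this sketch saves the pipeline with the stock tokenizer and re-attaches the custom one after loading, which is one possible workaround rather than the only one.

```python
import tempfile
from pathlib import Path

import spacy
from spacy.tokens import Doc
from spacy.training import Example


class WhitespaceTokenizer:
    """Hypothetical stand-in for the custom tokenizer: splits on whitespace."""

    def __init__(self, vocab):
        self.vocab = vocab

    def __call__(self, text):
        return Doc(self.vocab, words=text.split())


# Build a tiny pipeline with an initialized NER component
# (stand-in for the trained model from the question).
nlp = spacy.blank("en")
ner = nlp.add_pipe("ner")
ner.add_label("ORG")
example = Example.from_dict(
    nlp.make_doc("Acme hired Bob"), {"entities": [(0, 4, "ORG")]}
)
nlp.initialize(lambda: [example])

# Save to a temporary directory (assumed location, not the path from the question).
model_dir = Path(tempfile.mkdtemp()) / "my_model"
nlp.to_disk(model_dir)

# from_disk on a blank pipeline only restores data for components that
# already exist on the object, and a blank pipeline has none.
data_only = spacy.blank("en").from_disk(model_dir)

# spacy.load rebuilds the pipeline from the saved config, then loads the data.
loaded = spacy.load(model_dir)

# Re-attach the custom tokenizer after loading.
loaded.tokenizer = WhitespaceTokenizer(loaded.vocab)
doc = loaded("Acme hired Bob")

print(data_only.pipe_names)
print(loaded.pipe_names)
print([token.text for token in doc])
```

With this setup, the pipeline restored through spacy.blank(...).from_disk should report no components, while the one returned by spacy.load should list the NER component and use the re-attached tokenizer.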