I installed spaCy using
python3 -m pip install spacy
and downloaded two English models using
python3 -m spacy download en
and
python3 -m spacy download en_core_web_sm
When I attempt to load either of them with
import spacy
nlp = spacy.load('en')
I get
File "msgpack/_unpacker.pyx", line 187, in msgpack._cmsgpack.unpackb
ValueError: 1792000 exceeds max_bin_len(1048576)
Googling didn't help at all, and I don't understand what the error means. I'd be grateful for any pointers.
Separately, if you pass spaCy a very long text you can run into this limit:

[E088] Text of length 1029371 exceeds maximum of 1000000. The v2.x parser and NER models require roughly 1GB of temporary memory per 100,000 characters in the input. This means long texts may cause memory allocation errors.

In that case, set nlp.max_length = len(txt) + 100 (the 100 is just a cushion, not strictly necessary). This makes things simpler when handling documents/text of unknown length. Alternatively, you can remove the spaCy pipeline components you won't need, which also cuts memory use. Both options are shown in the sketch below.
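A minimal sketch of that approach (the input file name is a placeholder, and disabling "parser" and "ner" is only an example of dropping components you don't need):

import spacy

# Load the model, disabling components you don't need for your task
nlp = spacy.load("en_core_web_sm", disable=["parser", "ner"])

# Hypothetical long input; any string works
with open("big_document.txt", encoding="utf8") as f:
    txt = f.read()

# Raise the limit before processing; the +100 is just a cushion
nlp.max_length = len(txt) + 100

doc = nlp(txt)
print(f"Processed {len(doc)} tokens")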
As background on how loading works: a pipeline's settings are saved out with it as config.cfg. This is also how spaCy does it under the hood when loading a pipeline: it loads the config.cfg containing the language and pipeline information, initializes the language class, creates and adds the pipeline components based on the config, and then loads in the binary data. You can read more about this process in the spaCy documentation.
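For spaCy v3, those steps look roughly like this (a sketch only; the model path is an assumption, and in practice you would just call spacy.load):

from spacy import util

model_path = "/path/to/en_core_web_sm"  # hypothetical: wherever the pipeline package lives

# 1. Load the config.cfg with the language and pipeline information
config = util.load_config(f"{model_path}/config.cfg")

# 2. Initialize the language class and create the pipeline components
lang_cls = util.get_lang_class(config["nlp"]["lang"])
nlp = lang_cls.from_config(config)

# 3. Load in the binary data (weights, vocabulary, etc.)
nlp.from_disk(model_path)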
For the msgpack ValueError itself, try downgrading msgpack:
pip install msgpack==0.5.6
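After the downgrade, a quick smoke test (assuming en_core_web_sm is the model you downloaded):

import msgpack
import spacy

# The downgrade should leave you on 0.5.6
print(msgpack.version)  # expected: (0, 5, 6)

nlp = spacy.load("en_core_web_sm")
doc = nlp("This is a quick smoke test.")
print([token.text for token in doc])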