I am running pytextrank, and in its second stage I get this error from spaCy:
File "C:\Anaconda3\lib\pathlib.py", line 371, in wrapped
    return strfunc(str(pathobj), *args)
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Anaconda3\\lib\\site-packages\\spacy\\data\\en\\vocab\\strings.json'
I looked for strings.json, but no such file exists.
Interestingly, a similar error involving pathlib.py appeared when I installed spaCy, with the following message:
OSError: Symbolic link privilege not held
Does anyone have any idea? Thanks.
To load a pipeline from a data directory, you can use spacy.load() with the local path. This will look for a config.cfg in the directory and use the lang and pipeline settings to initialize a Language class with a processing pipeline and load in the model data.
Typically, the extension for these binary files is .spacy, and they are used as input format for specifying a training corpus and for spaCy's CLI train command. The built-in convert command helps you convert spaCy's previous JSON format to the new binary format.
spacy.blank function. Create a blank pipeline of a given language class. This function is the twin of spacy.load().
Finally, I can answer a question on Stack Overflow. I ran into the same problem but eventually solved it. Here is my suggestion:
There are two ways to get the model, and both are convenient.
1) From Python, via spaCy:
python3 -m spacy download en
Assuming you are using Python 3, this is done automatically and installs the model as a package, which you can import via import en or load using spacy.load('en').
2) From GitHub:
Follow the download link, select the newest version, and download it.
This is the most important part: you must unpack the downloaded tar or gzip file to get a folder. However, this is still not the path you want to link.
.
├── en_core_web_md-1.2.1
│   ├── deps
│   │   ├── config.json
│   │   └── model
│   ├── meta.json
│   ├── ner
│   │   ├── config.json
│   │   └── model
│   ├── pos
│   │   ├── config.json
│   │   └── model
│   └── vocab
│       ├── gazetteer.json
│       ├── lexemes.bin
│       ├── oov_prob
│       ├── serializer.json
│       ├── strings.json
│       └── vec.bin
You must link the folder that has this structure; spaCy will then find it through the shortcut name you give the link.
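If you are not sure which folder inside the unpacked archive is the right one, a small check can help. The following is a minimal sketch, not part of spaCy: find_model_dir is a hypothetical helper, and it assumes that the folder to link is the one holding meta.json next to vocab/strings.json, as in the tree above.

```python
from pathlib import Path
from typing import Optional


def find_model_dir(root: str) -> Optional[Path]:
    """Return the first directory under root (or root itself) that looks like model data.

    "Looks like model data" is an assumption taken from the tree above:
    a meta.json file next to a vocab/strings.json file.
    """
    root_path = Path(root)
    # Check root itself first, then every subdirectory in sorted order.
    candidates = [root_path] + sorted(p for p in root_path.rglob("*") if p.is_dir())
    for candidate in candidates:
        has_meta = (candidate / "meta.json").is_file()
        has_strings = (candidate / "vocab" / "strings.json").is_file()
        if has_meta and has_strings:
            return candidate
    return None
```

Calling find_model_dir(".") from the directory where you unpacked the archive should return the folder to pass to the link command, or None if nothing matches.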
Here is the link script you need:
base_path=`pwd`
sudo python3 -m spacy link ${base_path}/en_core_web_md-1.2.1 en_core_web --force
You can save this as a .sh file alongside that folder and run it.
That's it!
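On Windows, where a .sh script is less convenient, the same command can be assembled from Python. This is only a sketch: build_link_command is a hypothetical helper that mirrors the script above (absolute path, shortcut name, --force) and does not run anything by itself.

```python
import sys
from pathlib import Path
from typing import List


def build_link_command(model_dir: str, shortcut: str, force: bool = True) -> List[str]:
    """Build the `spacy link` command from the script above as an argument list."""
    # Resolve to an absolute path, like ${base_path}/... in the shell script.
    origin = str(Path(model_dir).resolve())
    command = [sys.executable, "-m", "spacy", "link", origin, shortcut]
    if force:
        command.append("--force")  # overwrite an existing shortcut link
    return command


if __name__ == "__main__":
    # Only prints the command; pass the list to subprocess.run(...) to execute it.
    print(" ".join(build_link_command("en_core_web_md-1.2.1", "en_core_web")))
```

Using sys.executable means the link is created for the same Python installation that runs the script, which matters when several interpreters are installed.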
The Symbolic link privilege not held error usually occurs when you've installed spaCy and the models into a system directory, but your user does not have the required permissions to create symbolic links. To solve this, either run download or link again as administrator or, if that's not possible, use a virtualenv to install everything into a user directory instead (for more info on this, see the troubleshooting docs).
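To see whether your current user can create symbolic links at all, you can try one in a temporary directory. This is a minimal sketch using only the standard library; can_create_symlinks is a hypothetical helper, and a temporary directory is only a rough stand-in for the spaCy install directory, which may have stricter permissions.

```python
import os
import tempfile


def can_create_symlinks() -> bool:
    """Return True if the current user can create a symlink in a temp directory."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "target")
        link = os.path.join(tmp, "link")
        os.mkdir(target)
        try:
            os.symlink(target, link, target_is_directory=True)
        except (OSError, NotImplementedError):
            # On Windows, a missing symlink privilege surfaces here as an OSError.
            return False
        return os.path.islink(link)


if __name__ == "__main__":
    print("symlinks allowed:", can_create_symlinks())
```

If this reports False in a normal shell but True in one started as administrator, the permissions explanation above most likely applies.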
As of v1.7.0, spaCy creates symlinks (a.k.a. shortcut links) in the spacy/data directory. This makes it easier to store your models wherever you want, install them as Python packages and load them using custom names, e.g. spacy.load('my_model').
What likely happened in your case is that spaCy failed to set up this link because of the permissions error, and now can't find and load the model, including vocab/strings.json. (The way spaCy failed here is not ideal, though; this has since been fixed in v1.7.3.)
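One way to confirm this diagnosis is to inspect the shortcut link directly. The sketch below is an illustration, not a spaCy API: shortcut_status is a hypothetical helper, and it assumes the link lives at spacy/data/<name> and should resolve to a folder containing vocab/strings.json, as the path in the error message suggests.

```python
import os
from pathlib import Path


def shortcut_status(data_dir: str, name: str) -> str:
    """Classify a shortcut link such as spacy/data/en.

    Returns "missing", "broken", "incomplete" or "ok".
    """
    link = Path(data_dir) / name
    if not os.path.lexists(link):
        return "missing"      # the link was never created
    if not link.exists():
        return "broken"       # the link exists but its target does not
    if not (link / "vocab" / "strings.json").is_file():
        return "incomplete"   # resolves, but the file from the error is absent
    return "ok"
```

For the traceback in the question, data_dir would be C:\Anaconda3\lib\site-packages\spacy\data and name would be en; a result of "missing" would match the failed link setup described above.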
Since the model is already installed, all you'd have to do is create a new symlink for it (either as admin, or in a virtualenv):
python -m spacy link en_core_web_sm en
(If you've downloaded a different model, simply replace en_core_web_sm with the name of that model. en is the shortcut to use and can be any name you want.)
Edit: In case you only want to use the tokenizer and don't care about the models, or want to use one of the supported languages that don't yet come with a statistical model, you can also just import the Language class in v1.7.3:
from spacy.fr import French
nlp = French()