spaCy needs a file that is not there: strings.json

I am running pytextrank, and in its second stage I get this error from spaCy:

File "C:\Anaconda3\lib\pathlib.py", line 371, in wrapped
    return strfunc(str(pathobj), *args)

FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Anaconda3\\lib\\site-packages\\spacy\\data\\en\\vocab\\strings.json'

I looked for strings.json, but no such file exists.

Interestingly, a similar error involving pathlib.py came up when I installed spaCy, with the following message:

OSError: Symbolic link privilege not held

Does anyone have any idea? Thanks.

asked Mar 26 '17 02:03

Peter

2 Answers

Finally, I can answer a question on Stack Overflow. I ran into the same problem but eventually solved it. Here is my suggestion:

1. Download a spaCy model, either with python -m spacy or from GitHub

Both ways are convenient.

1). With python -m spacy:

python3 -m spacy download en

Assuming you are using Python 3, this runs automatically and installs the model as a new package, which you can import by its package name (for example import en_core_web_sm) or load through the shortcut with spacy.load('en').

2). From GitHub

Follow the link, select the newest version, and download it.

2. Link your downloaded model (only needed if you did not use python -m spacy download and have to link the model manually)

This is the most important part. Unpack the downloaded tar or gzip archive to get a folder. Note that the top-level folder you get is still not the path you need to link.

.
├── en_core_web_md-1.2.1
│   ├── deps
│   │   ├── config.json
│   │   └── model
│   ├── meta.json
│   ├── ner
│   │   ├── config.json
│   │   └── model
│   ├── pos
│   │   ├── config.json
│   │   └── model
│   └── vocab
│       ├── gazetteer.json
│       ├── lexemes.bin
│       ├── oov_prob
│       ├── serializer.json
│       ├── strings.json
│       └── vec.bin

You must link the folder that has this structure. spaCy will then find the model through the shortcut name you give the link.
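As a rough aid for finding the right folder, the sketch below walks an unpacked download and returns the first directory that contains both meta.json and vocab/strings.json, as in the tree above. The name find_model_dir is made up for illustration and is not part of spaCy, and the two marker files are an assumption taken from that listing; other model versions may be laid out differently.

```python
import os


def find_model_dir(root):
    """Return the first directory under root that looks like a model folder.

    "Looks like" is an assumption based on the tree above: the folder must
    contain meta.json and vocab/strings.json. Returns None if nothing matches.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subfolders in a deterministic order
        has_meta = "meta.json" in filenames
        has_strings = os.path.isfile(
            os.path.join(dirpath, "vocab", "strings.json")
        )
        if has_meta and has_strings:
            return dirpath
    return None


if __name__ == "__main__":
    # Search the current directory, e.g. where the archive was unpacked.
    print(find_model_dir(os.getcwd()))
```

The path it prints is the one to pass to the link command; if it prints None, the archive was probably not unpacked completely.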

Here is the link script you need:

# Link the unpacked model folder under the shortcut name en_core_web
base_path="$(pwd)"
sudo python3 -m spacy link "${base_path}/en_core_web_md-1.2.1" en_core_web --force

You can save this as a .sh file next to that folder and run it.

That's it!

answered Oct 21 '22 17:10

Nicholas Jela

The Symbolic link privilege not held error usually occurs when you've installed spaCy and the models into a system directory, but your user does not have the required permissions to create symbolic links. To solve this, either run download or link again as administrator or, if that's not possible, use a virtualenv to install everything into a user directory instead (for more info on this, see the troubleshooting docs).
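To see whether the current user is allowed to create symbolic links at all, a small check along these lines may help. It is only a sketch: can_create_symlinks is a hypothetical helper, not a spaCy function, and it tests symlink creation in a temporary directory, so it says nothing about write access to the spaCy installation directory itself.

```python
import os
import tempfile


def can_create_symlinks():
    """Try to create a throwaway directory symlink and report the outcome."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "target")
        link = os.path.join(tmp, "link")
        os.mkdir(target)
        try:
            os.symlink(target, link, target_is_directory=True)
        except (OSError, NotImplementedError, AttributeError):
            # On Windows, a missing symlink privilege typically surfaces here.
            return False
        return os.path.islink(link)


if __name__ == "__main__":
    print("symlinks allowed:", can_create_symlinks())
```

If this prints False in a normal prompt but True in an administrator prompt, the privilege is the likely cause of the failed link.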

As of v1.7.0, spaCy creates symlinks (also called shortcut links) in the spacy/data directory. This makes it easier to store your models wherever you want, install them as Python packages and load them using custom names, e.g. spacy.load('my_model').

What likely happened in your case is that spaCy failed to set up this link because of the permissions error, and now can't find and load the model, including vocab/strings.json. (The way spaCy failed here is not ideal, though; this has since been fixed in v1.7.3.)

Since the model is already installed, all you'd have to do is create a new symlink for it (either as admin, or in a virtualenv):

python -m spacy link en_core_web_sm en

(If you've downloaded a different model, simply replace en_core_web_sm with the name of that model. en is the shortcut to use and can be any name you want.)
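Before or after re-linking, it can be useful to look at what is actually sitting in the spacy/data directory. The following sketch classifies a shortcut entry as missing, broken, incomplete, or ok. The helper shortcut_status and its return values are invented for this example, and the layout it assumes (data/<shortcut>/vocab/strings.json) is inferred from the path in the traceback in the question.

```python
import importlib.util
import os


def shortcut_status(data_dir, shortcut="en"):
    """Classify the state of a shortcut entry inside a spaCy data directory.

    Returns "missing", "broken", "incomplete" or "ok".
    """
    link_path = os.path.join(data_dir, shortcut)
    if not os.path.lexists(link_path):
        return "missing"   # no link or folder with that name at all
    if not os.path.exists(link_path):
        return "broken"    # a symlink exists but its target is gone
    strings = os.path.join(link_path, "vocab", "strings.json")
    return "ok" if os.path.isfile(strings) else "incomplete"


if __name__ == "__main__":
    # Locate the installed spaCy package without importing it.
    spec = importlib.util.find_spec("spacy")
    if spec is None or spec.origin is None:
        print("spaCy is not installed in this environment")
    else:
        data_dir = os.path.join(os.path.dirname(spec.origin), "data")
        print(data_dir, "->", shortcut_status(data_dir))
```

A result of "missing" or "broken" would be consistent with the failed link described above, in which case running the link command again should fix it.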

Edit: In case you only want to use the tokenizer and don't care about the models, or want to use one of the supported languages that don't yet come with a statistical model, you can also just import the Language class in v1.7.3:

from spacy.fr import French
nlp = French()

answered Oct 21 '22 17:10

Ines Montani

Tags: python, spacy, pytextrank