
spaCy: Can't find model 'en_core_web_sm' on Windows 10 and Python 3.5.3 :: Anaconda custom (64-bit)

What is the difference between spacy.load('en_core_web_sm') and spacy.load('en')? This link explains the different model sizes, but I am still not clear on how spacy.load('en_core_web_sm') and spacy.load('en') differ.

spacy.load('en') runs fine for me, but spacy.load('en_core_web_sm') throws an error.

I installed spaCy as shown below. When I go to a Jupyter notebook and run nlp = spacy.load('en_core_web_sm'), I get the following error:

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-4-b472bef03043> in <module>()
      1 # Import spaCy and load the language library
      2 import spacy
----> 3 nlp = spacy.load('en_core_web_sm')
      4
      5 # Create a Doc object

C:\Users\nikhizzz\AppData\Local\conda\conda\envs\tensorflowspyder\lib\site-packages\spacy\__init__.py in load(name, **overrides)
     13     if depr_path not in (True, False, None):
     14         deprecation_warning(Warnings.W001.format(path=depr_path))
---> 15     return util.load_model(name, **overrides)
     16
     17

C:\Users\nikhizzz\AppData\Local\conda\conda\envs\tensorflowspyder\lib\site-packages\spacy\util.py in load_model(name, **overrides)
    117     elif hasattr(name, 'exists'):  # Path or Path-like to model data
    118         return load_model_from_path(name, **overrides)
--> 119     raise IOError(Errors.E050.format(name=name))
    120
    121

OSError: [E050] Can't find model 'en_core_web_sm'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.

How I installed spaCy:

(C:\Users\nikhizzz\AppData\Local\conda\conda\envs\tensorflowspyder) C:\Users\nikhizzz>conda install -c conda-forge spacy
Fetching package metadata .............
Solving package specifications: .

Package plan for installation in environment C:\Users\nikhizzz\AppData\Local\conda\conda\envs\tensorflowspyder:

The following NEW packages will be INSTALLED:

    blas:           1.0-mkl
    cymem:          1.31.2-py35h6538335_0    conda-forge
    dill:           0.2.8.2-py35_0           conda-forge
    msgpack-numpy:  0.4.4.2-py_0             conda-forge
    murmurhash:     0.28.0-py35h6538335_1000 conda-forge
    plac:           0.9.6-py_1               conda-forge
    preshed:        1.0.0-py35h6538335_0     conda-forge
    pyreadline:     2.1-py35_1000            conda-forge
    regex:          2017.11.09-py35_0        conda-forge
    spacy:          2.0.12-py35h830ac7b_0    conda-forge
    termcolor:      1.1.0-py_2               conda-forge
    thinc:          6.10.3-py35h830ac7b_2    conda-forge
    tqdm:           4.29.1-py_0              conda-forge
    ujson:          1.35-py35hfa6e2cd_1001   conda-forge

The following packages will be UPDATED:

    msgpack-python: 0.4.8-py35_0                         --> 0.5.6-py35he980bc4_3 conda-forge

The following packages will be DOWNGRADED:

    freetype:       2.7-vc14_2               conda-forge --> 2.5.5-vc14_2

Proceed ([y]/n)? y

blas-1.0-mkl.t 100% |###############################| Time: 0:00:00   0.00  B/s
cymem-1.31.2-p 100% |###############################| Time: 0:00:00   1.65 MB/s
msgpack-python 100% |###############################| Time: 0:00:00   5.37 MB/s
murmurhash-0.2 100% |###############################| Time: 0:00:00   1.49 MB/s
plac-0.9.6-py_ 100% |###############################| Time: 0:00:00   0.00  B/s
pyreadline-2.1 100% |###############################| Time: 0:00:00   4.62 MB/s
regex-2017.11. 100% |###############################| Time: 0:00:00   3.31 MB/s
termcolor-1.1. 100% |###############################| Time: 0:00:00 187.81 kB/s
tqdm-4.29.1-py 100% |###############################| Time: 0:00:00   2.51 MB/s
ujson-1.35-py3 100% |###############################| Time: 0:00:00   1.66 MB/s
dill-0.2.8.2-p 100% |###############################| Time: 0:00:00   4.34 MB/s
msgpack-numpy- 100% |###############################| Time: 0:00:00   0.00  B/s
preshed-1.0.0- 100% |###############################| Time: 0:00:00   0.00  B/s
thinc-6.10.3-p 100% |###############################| Time: 0:00:00   5.49 MB/s
spacy-2.0.12-p 100% |###############################| Time: 0:00:10   7.42 MB/s

(C:\Users\nikhizzz\AppData\Local\conda\conda\envs\tensorflowspyder) C:\Users\nikhizzz>python -V
Python 3.5.3 :: Anaconda custom (64-bit)

(C:\Users\nikhizzz\AppData\Local\conda\conda\envs\tensorflowspyder) C:\Users\nikhizzz>python -m spacy download en
Collecting en_core_web_sm==2.0.0 from https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz#egg=en_core_web_sm==2.0.0
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz (37.4MB)
    100% |################################| 37.4MB ...
Installing collected packages: en-core-web-sm
  Running setup.py install for en-core-web-sm ... done
Successfully installed en-core-web-sm-2.0.0

    Linking successful
    C:\Users\nikhizzz\AppData\Local\conda\conda\envs\tensorflowspyder\lib\site-packages\en_core_web_sm
    -->
    C:\Users\nikhizzz\AppData\Local\conda\conda\envs\tensorflowspyder\lib\site-packages\spacy\data\en

    You can now load the model via spacy.load('en')

(C:\Users\nikhizzz\AppData\Local\conda\conda\envs\tensorflowspyder) C:\Users\nikhizzz>
user2543622 asked Jan 23 '19 19:01



2 Answers

Initially I downloaded two English model packages by running the following commands in the Anaconda prompt:

python -m spacy download en_core_web_lg
python -m spacy download en_core_web_sm

But I kept getting a linkage error. Running the command below finally established the link and resolved the error:

python -m spacy download en 

Also make sure to restart your runtime (kernel) if you are working in Jupyter. PS: if you still get a linkage error, try running the prompt with admin privileges.
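If restarting does not help, it may be worth checking that the notebook kernel runs in the same environment you installed into. The following is only a minimal sketch using the standard library; the helper name describe_environment is made up for this example and is not part of spaCy.

```python
import importlib.util
import sys


def describe_environment(module_names):
    """Report which interpreter is running and which modules it can import."""
    # sys.executable shows the Python binary behind the current kernel or shell
    report = {"python": sys.executable}
    for name in module_names:
        # find_spec returns None when a top-level module cannot be found
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, value in describe_environment(["spacy", "en_core_web_sm"]).items():
        print(key, "->", value)
```

Run it once in the Anaconda prompt and once in a notebook cell. If the interpreter paths differ, the notebook is not using the environment where the model was installed.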

Tarun Reddy answered Sep 29 '22 06:09


The answer to your confusion lies in a Unix concept: softlinks, which are roughly similar to shortcuts in Windows. Let me explain.

When you run spacy download en, spaCy looks for the small model that best matches your spaCy version. That model defaults to en_core_web_sm, which exists in several variants corresponding to different spaCy versions (for example, spacy and spacy-nightly have en_core_web_sm models of different sizes).

When spaCy finds the best model for you, it downloads it and then links the name en to the package it downloaded, e.g. en_core_web_sm. That basically means that whenever you refer to en, you are referring to en_core_web_sm. In other words, after linking, en is not a "real" package; it is just another name for en_core_web_sm.
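To make the idea of name resolution concrete, here is a toy sketch of a lookup that tries a shortcut link first, then an installed package, then a filesystem path. The three categories come from the E050 error message in the question; the order, the function name resolve_model_name, and its parameters are assumptions made for illustration, not spaCy's actual code.

```python
import tempfile
from pathlib import Path


def resolve_model_name(name, data_dir, installed_packages):
    """Toy model of the lookup: link first, then package, then path."""
    data_dir = Path(data_dir)
    if (data_dir / name).exists():
        # e.g. spacy/data/en created by "spacy download en"
        return "shortcut link"
    if name in installed_packages:
        # e.g. en_core_web_sm installed through pip
        return "python package"
    if Path(name).exists():
        # a directory containing model data
        return "path"
    raise OSError("Can't find model '{}'".format(name))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "en").mkdir()  # stands in for the en shortcut link
        print(resolve_model_name("en", tmp, set()))
        print(resolve_model_name("en_core_web_sm", tmp, {"en_core_web_sm"}))
```

In this sketch, en resolves through the data directory alone, while en_core_web_sm resolves only if it shows up in the set of installed packages.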

However, it doesn't necessarily work the other way. You can refer directly to en_core_web_sm only if spaCy can see it as an installed Python package. When you ran spacy download en, spaCy basically did a pip install (your log shows "Successfully installed en-core-web-sm-2.0.0") and then created the link en inside its data directory. So en is not a package of its own: it is just a softlink to en_core_web_sm, and loading through that link can succeed even when the lookup by package name fails.

Of course, you can download en_core_web_sm directly with python -m spacy download en_core_web_sm, and you can also link the name en to other models. For example, you could run python -m spacy download en_core_web_lg and then python -m spacy link en_core_web_lg en. That would make en a name for en_core_web_lg, which is a large spaCy model for the English language.
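As an alternative to loading by name, a model that is installed as a Python package can usually be imported like any other module and loaded through its load() function. Below is a hedged sketch of that approach; the helper load_first_available is hypothetical, and it assumes each model package exposes a module-level load().

```python
import importlib


def load_first_available(module_names):
    """Return (name, pipeline) for the first model package that imports."""
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue  # not installed in this environment, try the next one
        # assumption: the model package provides a module-level load()
        return name, module.load()
    raise OSError("None of these model packages are installed: {}".format(module_names))


if __name__ == "__main__":
    try:
        name, nlp = load_first_available(["en_core_web_lg", "en_core_web_sm"])
        print("Loaded", name)
    except OSError as err:
        print(err)
```

This bypasses the shortcut link entirely, so it can help tell a missing package apart from a missing link.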

Hope it is clear now :)

gdaras answered Sep 29 '22 08:09