
spaCy and spaCy models in setup.py

In my project I have spaCy as a dependency in my setup.py, but I also want to add a default model.

My attempt so far has been:

install_requires=['spacy', 'en_core_web_sm'],
dependency_links=['https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz#egg=en_core_web_sm'],

in my setup.py, but both a regular pip install of my package and pip install --process-dependency-links fail with:

pip._internal.exceptions.DistributionNotFound: No matching distribution found for en_core_web_sm (from mypackage==0.1)

I found a GitHub issue from AllenAI describing the same problem, with no solution.

Note that if I pip install the URL of the model directly, it works fine, but I want the model to be installed as a dependency when my package is installed with pip install.

asked Nov 19 '18 22:11 by w4nderlust




1 Answer

You can use pip's recent support for PEP 508 URL requirements:

install_requires=[
    'spacy',
    'en_core_web_sm @ https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz',
],

Note that this requires you to build your project with up-to-date versions of setuptools and wheel (at least v0.32.0 for wheel; not sure about setuptools), and your users will only be able to install your project if they're using at least version 18.1 of pip.

More importantly, though, this is not a viable solution if you intend to distribute your package on PyPI; quoting pip's release notes:

As a security measure, pip will raise an exception when installing packages from PyPI if those packages depend on packages not also hosted on PyPI. In the future, PyPI will block uploading packages with such external URL dependencies directly.
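
If you pin several models or bump versions often, the URL requirement string can be generated instead of typed by hand. The following is only a minimal sketch: the helper name model_requirement and the constant SPACY_MODELS_RELEASES are hypothetical, and the URL layout is assumed from the single en_core_web_sm-2.0.0 link above, so check it against the actual release page for other models.

```python
# Base URL taken from the en_core_web_sm link above; assumed to hold for other releases.
SPACY_MODELS_RELEASES = "https://github.com/explosion/spacy-models/releases/download"


def model_requirement(name, version):
    """Build a PEP 508 direct-URL requirement ("name @ url") for a spaCy model.

    Hypothetical helper, not part of spaCy, pip, or setuptools.
    """
    archive = f"{name}-{version}"
    return f"{name} @ {SPACY_MODELS_RELEASES}/{archive}/{archive}.tar.gz"


# This list can be passed as install_requires=... to setup() in setup.py.
install_requires = [
    "spacy",
    model_requirement("en_core_web_sm", "2.0.0"),
]
```

The resulting list is equivalent to the hand-written install_requires above, so the same pip, setuptools, and wheel version requirements and the same PyPI restriction apply.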

answered Sep 28 '22 01:09 by jwodder