Directly load spacy model from packaged tar.gz file

Is it possible to load a packaged spacy model (i.e. foo.tar.gz) directly from the tar file instead of installing it beforehand? I would imagine something like:

import spacy

nlp = spacy.load("/some/path/foo.tar.gz")
evermean asked Mar 14 '18 09:03


2 Answers

No, that's currently not possible. The main purpose of the .tar.gz archives is to make the models easy to install via pip install. However, you can always extract the model data from the archive and then load it from a path. See the spaCy usage documentation on models for more details.

nlp = spacy.load('/path/to/en_core_web_md')

Using the spacy link command you can also create "shortcut links" for your models, i.e. symlinks that let you load in models using a custom name instead of the full path or package name. This is especially useful if you're working with large models and multiple environments (and don't want to install the data in each of them).

python -m spacy link /path/to/model_data cool_model

The above shortcut link would then let you load your model like this:

nlp = spacy.load('cool_model')

Alternatively, if you really need to load models from an archive, you could always write a simple wrapper for spacy.load that takes the file, extracts the contents, reads the model meta, gets the path to the data directory and then calls spacy.util.load_model_from_path on it and returns the nlp object.
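The wrapper described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names find_model_data_dir and load_model_from_archive are hypothetical, the data directory is assumed to be the one holding meta.json next to a vocab/ subdirectory, and the sketch calls spacy.load on the extracted path (as shown earlier in this answer) rather than spacy.util.load_model_from_path.

```python
import tarfile
import tempfile
from pathlib import Path


def find_model_data_dir(root):
    """Return the first directory under root that looks like spaCy model data.

    Assumption: the data directory holds meta.json next to a vocab/ subdirectory.
    """
    for meta in sorted(Path(root).rglob("meta.json")):
        if (meta.parent / "vocab").is_dir():
            return meta.parent
    raise FileNotFoundError(f"no model data directory found under {root}")


def load_model_from_archive(archive_path):
    """Extract a packaged model archive to a temp dir and load it from there."""
    import spacy  # imported lazily so the helper above works without spaCy

    with tempfile.TemporaryDirectory() as tmp:
        with tarfile.open(archive_path, "r:gz") as tar:
            tar.extractall(tmp)  # only extract archives you trust
        # spacy.load runs before the temporary directory is cleaned up.
        return spacy.load(find_model_data_dir(tmp))
```

Usage would then look like nlp = load_model_from_archive("/some/path/foo.tar.gz"). Note that the archive is extracted again on every call, so for large models extracting once and loading from the path is likely to be faster.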

Ines Montani answered Oct 14 '22 16:10


It's not a direct answer, but it might be helpful if you want to load a model from a single file with spaCy. This can be done by using pickle.

First, you need to load your spaCy model and serialize it with pickle:

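import pickle

import spacy

# Load the model without the parser component.
s = spacy.load("en_core_web_sm", disable=["parser"])

# Serialize the loaded pipeline to a single file.
with open("save.p", "wb") as f:
    pickle.dump(s, f)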

Afterwards, you can load the pickle dump somewhere else directly as a spaCy model:

with open("save.p", "rb") as f:
    s = pickle.load(f)
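Note that pickle on its own does not compress anything. If file size matters, one option is to write the pickle through gzip. The following is a minimal sketch, and the helper names dump_compressed and load_compressed are hypothetical, not part of spaCy or pickle:

```python
import gzip
import pickle


def dump_compressed(obj, path):
    """Pickle obj into a gzip-compressed file at path."""
    with gzip.open(path, "wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_compressed(path):
    """Load an object written by dump_compressed. Only unpickle trusted files."""
    with gzip.open(path, "rb") as f:
        return pickle.load(f)
```

With a loaded model s, this would be used as dump_compressed(s, "save.p.gz") and later s = load_compressed("save.p.gz").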
Rene B. answered Oct 14 '22 15:10