
Install issue with the Python spaCy package in an Anaconda environment

I'm attempting to follow this tutorial to install the natural language processing package spaCy into a Python 3 Anaconda environment on Windows 8.

I opened a console, cd-ed to my site-packages folder, activated the environment, and ran the pip install. Everything seemed fine, except that I couldn't run the second command here:

$ pip install spacy
$ python -m spacy.en.download

Now I can import the package successfully, but when I run the second line below, I get the following error:

>>> from spacy.en import English   #this works
>>> nlp = English()                #this doesn't


Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\garrett\Anaconda\envs\py3k\lib\site-packages\spacy\en\__init__.py", line 64, in __init__
    get_lex_props=get_lex_props)
  File "spacy/vocab.pyx", line 42, in spacy.vocab.Vocab.__init__ (spacy/vocab.cpp:2216)
OSError: Directory C:\Users\garrett\Anaconda\envs\py3k\lib\site-packages\spacy\en\data\vocab not found -- cannot load Vocab.

I think this is because I couldn't run python -m spacy.en.download.

Can anyone give me an idea of what python -m spacy.en.download is supposed to be doing?

Can anyone provide a walkthrough for how to get spaCy installed in an anaconda environment?

Here's the error I get after changing to the directory, activating the Python environment, and running the command. The first several times I tried, my Spyder editor went unresponsive and I killed the console; the most recent time I got this error:

$ cd C:\Users\garrett\Anaconda\envs\py3k\Lib\site-packages
C:\Users\garrett\Anaconda\envs\py3k\Lib\site-packages>activate py3k
[py3k] C:\Users\garrett\Anaconda\envs\py3k\Lib\site-packages>python -m spacy.en.download

Moving existing dir C:\Users\garrett\Anaconda\envs\py3k\Lib\site-packages\spacy\en\data to /tmp
Traceback (most recent call last):
  File "C:\Users\garrett\Anaconda\envs\py3k\lib\runpy.py", line 160, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "C:\Users\garrett\Anaconda\envs\py3k\lib\runpy.py", line 73, in _run_code
    exec(code, run_globals)
  File ".\spacy\en\download.py", line 56, in <module>
    plac.call(main)
  File ".\plac_core.py", line 309, in call
    cmd, result = parser_from(obj).consume(arglist)
  File ".\plac_core.py", line 195, in consume
    return cmd, self.func(*(args + varargs + extraopts), **kwargs)
  File ".\spacy\en\download.py", line 51, in main
    shutil.move(DEST_DIR, '/tmp')
  File "C:\Users\garrett\Anaconda\envs\py3k\lib\shutil.py", line 521, in move
    raise Error("Destination path '%s' already exists" % real_dst)
shutil.Error: Destination path '/tmp\data' already exists

I'd appreciate any help or advice you can provide.

ghonke asked Nov 10 '22 19:11


1 Answer

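You have hit this bug, which should already be fixed in the latest version. spaCy can't download the data because, before downloading, it tries to move the existing data directory to /tmp, and a data folder already exists there (probably left over from a previous interrupted download). That is what the last line of your traceback says: Destination path '/tmp\data' already exists. A workaround is to delete the /tmp/data folder and retry the download. On Windows, /tmp resolves to a tmp folder at the root of the current drive, so in your case that is most likely C:\tmp\data.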
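Here is a minimal sketch of why the move fails and why clearing the leftover folder helps. It assumes download.py does essentially shutil.move(DEST_DIR, '/tmp'), as line 51 in the traceback suggests. The helper move_replacing_stale and the throwaway directory names are hypothetical, for illustration only; they are not part of spaCy.

```python
import os
import shutil
import tempfile


def move_replacing_stale(src_dir, dest_parent):
    """Move src_dir into dest_parent, first deleting a same-named leftover there.

    Hypothetical helper for illustration; spaCy does not provide it.
    """
    stale = os.path.join(dest_parent, os.path.basename(src_dir))
    if os.path.isdir(stale):
        shutil.rmtree(stale)  # remove the copy left by an earlier, interrupted run
    return shutil.move(src_dir, dest_parent)


# Work in a throwaway directory so nothing real is touched.
root = tempfile.mkdtemp()
data_dir = os.path.join(root, "spacy_en", "data")  # stands in for ...\spacy\en\data
tmp_dir = os.path.join(root, "tmp")                # stands in for /tmp
os.makedirs(data_dir)
os.makedirs(os.path.join(tmp_dir, "data"))         # leftover from a previous attempt

try:
    # Same call shape as in the traceback: move a directory into an existing
    # directory that already contains an entry with the same name.
    shutil.move(data_dir, tmp_dir)
    plain_move_failed = False
except shutil.Error as err:
    plain_move_failed = True
    print("plain move failed:", err)

# After the leftover is removed, the same move goes through.
moved_to = move_replacing_stale(data_dir, tmp_dir)
print("moved to:", moved_to)

shutil.rmtree(root)  # clean up the demonstration directory
```

For the real fix you don't need any of this code: delete the leftover data folder by hand and rerun python -m spacy.en.download from the activated environment.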

elyase answered Nov 15 '22 09:11