I am trying to run nltk on a SUSE Linux box which cannot be connected to the internet.
I have successfully installed nltk and it runs, but when I submit
>>> tagged = nltk.pos_tag(tokens)
I get this error:
LookupError:
**********************************************************************
Resource 'tokenizers/punkt/english.pickle' not found. Please use the NLTK Downloader to obtain the resource:
I cannot use the downloader since I can't connect the box to the internet.
Does anyone know how I can install the necessary packages?
Data is downloaded to the nltk_data directory. Where that is differs from one system to another, but you can find out by doing the following:
import nltk
print(nltk.data.find('.'))
english.pickle should go at the path named in the error message, relative to the nltk_data directory: <nltk_data>/tokenizers/punkt/ (tagger models sit under <nltk_data>/taggers/ in the same way). The easiest way to put it there is to use the downloader on a machine that has internet access, then copy it over and put it in the same subfolder. There's only one version of the file, so you can download it on a Windows box, no problem.
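For what it's worth, here is a rough sketch of the copy step, assuming Python 3 on both machines. The function names are made up for illustration, and only the standard library is used. On the online machine you would first fetch the data (for example with nltk.download('punkt', download_dir=...)), then zip the whole directory so the subfolder layout survives the transfer.

```python
import os
import shutil


def pack_nltk_data(data_dir, archive_base):
    """Zip data_dir on the online machine; returns the path of the archive."""
    # make_archive adds the ".zip" suffix to archive_base itself.
    return shutil.make_archive(archive_base, "zip", root_dir=data_dir)


def unpack_nltk_data(archive_path, target_dir):
    """Extract the archive on the offline machine, keeping the subfolders."""
    os.makedirs(target_dir, exist_ok=True)
    shutil.unpack_archive(archive_path, target_dir, "zip")
    return target_dir
```

After unpacking, the file from the error message should be at <target_dir>/tokenizers/punkt/english.pickle. If it ends up one level too deep (nltk_data/nltk_data/...), the archive was probably made from the parent directory.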
The downloader stores the files in a particular folder. I imagine it's possible to download on an online machine and copy the files to the equivalent location on your offline machine. On my machine, it downloads to /usr/local/lib/nltk_data.
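If the equivalent location on the offline box is not writable, one option is to copy the files somewhere else and add that folder to NLTK's search path. This is only a sketch: use_offline_data is a hypothetical helper, and the default resource name is just the one from the error message above. nltk.data.path is the list of directories NLTK searches, and nltk.data.find looks a resource up in them.

```python
import os


def use_offline_data(data_dir, resource="tokenizers/punkt/english.pickle"):
    """Make NLTK search data_dir, after checking the copied file is there."""
    expected = os.path.join(data_dir, *resource.split("/"))
    if not os.path.isfile(expected):
        raise FileNotFoundError("expected the copied file at " + expected)
    # Imported late so the path check above works even without NLTK installed.
    import nltk
    if data_dir not in nltk.data.path:
        nltk.data.path.append(data_dir)
    # Raises LookupError if NLTK still cannot locate the resource.
    return nltk.data.find(resource)
```

Calling use_offline_data('/usr/local/lib/nltk_data') before nltk.pos_tag(tokens) should then either succeed or tell you exactly which file is missing. Setting the NLTK_DATA environment variable to the folder is an alternative that needs no code change.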