When I run the code below
python -c "import nltk; nltk.download('punkt');
nltk.download('averaged_perceptron_tagger');
nltk.download('maxent_treebank_pos_tagger');
nltk.download('wordnet')"
the console says
[nltk_data] Error loading punkt: HTTP Error 405: Not allowed.
[nltk_data] Error loading averaged_perceptron_tagger: HTTP Error 405:
[nltk_data] Not allowed.
[nltk_data] Error loading maxent_treebank_pos_tagger: HTTP Error 405:
[nltk_data] Not allowed.
[nltk_data] Error loading wordnet: HTTP Error 405: Not allowed.
Download individual packages from https://www.nltk.org/nltk_data/ (see the “download” links) and unzip each one into the appropriate subfolder. For example, the Brown Corpus, found at https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/packages/corpora/brown.zip, should be unzipped to nltk_data/corpora/brown.
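As a rough illustration of the manual route, the sketch below builds a package URL and unzips an already-downloaded archive into the matching subfolder. The URL pattern is copied from the Brown Corpus example; that other packages follow the same layout is an assumption, and the helper names `package_url` and `install_package_zip` are hypothetical, not part of NLTK.

```python
import zipfile
from pathlib import Path

# URL pattern taken from the Brown Corpus example; other packages are
# assumed (not verified here) to follow the same layout.
BASE_URL = "https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/packages"


def package_url(category, name):
    """Build the download URL for a package, e.g. ("corpora", "brown")."""
    return f"{BASE_URL}/{category}/{name}.zip"


def install_package_zip(zip_path, nltk_data_dir, category):
    """Unzip a downloaded package archive into <nltk_data_dir>/<category>/."""
    target = Path(nltk_data_dir) / category
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target
```

For instance, after saving brown.zip by hand, `install_package_zip("brown.zip", "/home/username/nltk_data", "corpora")` would place its contents under nltk_data/corpora.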
punkt is the package required for tokenization. You can download it with the NLTK download manager or programmatically with nltk.download('punkt').
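When downloading programmatically, it may help to collect failures instead of only reading the console. The sketch below relies on `nltk.download` returning a false value when a package cannot be fetched, which is my understanding of its behaviour rather than something stated in this thread; `download_packages` is a hypothetical helper name.

```python
def download_packages(names, download=None):
    """Try to download each package and return the names that failed."""
    if download is None:
        # Imported lazily so the helper can be exercised without NLTK installed.
        import nltk
        download = nltk.download
    # Assumption: the download callable returns a truthy value on success.
    return [name for name in names if not download(name)]


if __name__ == "__main__":
    wanted = ["punkt", "averaged_perceptron_tagger",
              "maxent_treebank_pos_tagger", "wordnet"]
    failed = download_packages(wanted)
    if failed:
        print("Could not download:", ", ".join(failed))
```

Passing the download function as a parameter is only a convenience for testing; in normal use the default is enough.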
This is caused by an outage of the GitHub raw file links.
In the meantime, a stop-gap solution is to download the data manually:
PATH_TO_NLTK_DATA=/home/username/nltk_data/
wget https://github.com/nltk/nltk_data/archive/gh-pages.zip
unzip gh-pages.zip
mv nltk_data-gh-pages/ $PATH_TO_NLTK_DATA
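After moving the archive contents, NLTK may still fail to find a package if the files ended up one or two levels deeper than expected. The sketch below searches a directory tree for a given package and reports which directory would serve as the data root. `find_data_roots` is a hypothetical helper, and the default category and package name are only examples.

```python
from pathlib import Path


def find_data_roots(base_dir, category="tokenizers", name="punkt"):
    """Return directories under base_dir holding <category>/<name> or <category>/<name>.zip."""
    roots = []
    for category_dir in Path(base_dir).rglob(category):
        if not category_dir.is_dir():
            continue
        unpacked = category_dir / name
        zipped = category_dir / f"{name}.zip"
        if unpacked.exists() or zipped.exists():
            # The parent of the category folder is the candidate data root.
            roots.append(category_dir.parent)
    return sorted(roots)
```

Any directory returned here could then be added to NLTK's search path, for example with `nltk.data.path.append(str(root))`.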
We're working on finding an alternative way to host the data and model downloads.
Meanwhile, @everyone, please check your script(s) and make sure that you're not overloading the data downloads! Thank you in advance!!
Please check https://github.com/nltk/nltk/issues/1787 for the latest updates on this issue.