
Corpora/stopwords not found when importing the nltk library

Tags: python, nltk

I am trying to use the nltk package in Python 2.7:

    import nltk
    stopwords = nltk.corpus.stopwords.words('english')
    print(stopwords[:10])

Running this gives me the following error:

    LookupError:
    **********************************************************************
      Resource 'corpora/stopwords' not found.  Please use the NLTK
      Downloader to obtain the resource:  >>> nltk.download()

So I opened my Python terminal and did the following:

    import nltk
    nltk.download()

Which gives me:

showing info https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml 

However, this never seems to finish, and running it again still gives me the same error. Any thoughts on where this goes wrong?

Frits Verstraten asked Jan 12 '17 10:01



2 Answers

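Calling nltk.download() with no arguments opens NLTK's interactive downloader, which waits for you to choose what to install, so the call appears to hang (the downloader window may be hidden behind other windows). Downloading every item in nltk data from there can also take a long time. You can instead download only the stopwords corpus that you need: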

    import nltk
    nltk.download('stopwords')

Or from the command line (thanks to Rafael Valero's answer):

python -m nltk.downloader stopwords 
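If the script runs on several machines, it may be convenient to download the corpus only when it is missing. Here is a minimal sketch of that idea. The helper name `ensure_nltk_resource` and its `find`/`download` parameters are my own, not part of NLTK; it relies on `nltk.data.find` raising `LookupError` for a missing resource (the same error shown in the question) and on `nltk.download` accepting a package id.

```python
def ensure_nltk_resource(resource_path, package_id, find=None, download=None):
    """Download `package_id` only if `resource_path` is not installed yet.

    `find` and `download` default to nltk.data.find and nltk.download;
    they are parameters only so the helper can be tried with stand-ins.
    Returns True if a download was triggered, False otherwise.
    """
    if find is None or download is None:
        import nltk  # imported lazily, only when the real NLTK calls are needed
        find = find or nltk.data.find
        download = download or nltk.download
    try:
        # Raises LookupError when the resource is not in any nltk_data directory.
        find(resource_path)
        return False
    except LookupError:
        # Fetch just this one package instead of opening the full downloader.
        download(package_id)
        return True


# Usage (needs nltk installed, and network access on the first run):
#   ensure_nltk_resource('corpora/stopwords', 'stopwords')
#   from nltk.corpus import stopwords
#   print(stopwords.words('english')[:10])
```

On later runs the resource is found locally and nothing is downloaded.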

Reference:

  • Installing NLTK Data - Command line installation
Kurt Bourbaki answered Sep 21 '22 20:09



The same as mentioned by Kurt Bourbaki, but from the command line:

python -m nltk.downloader stopwords 
Rafael Valero answered Sep 21 '22 20:09
