
"ImportError: cannot import name StanfordNERTagger" in NLTK

Tags: python, nlp, nltk

I'm unable to import the Stanford NER tagger in NLTK. This is what I have done:

Downloaded the java code from here and added an environment variable STANFORD_MODELS with the path to the folder where the java code is stored.

That should be sufficient according to the information that is provided on the NLTK site. It says:

"Tagger models need to be downloaded from http://nlp.stanford.edu/software and the STANFORD_MODELS environment variable set (a colon-separated list of paths)."

Would anybody be kind enough to help me, please?

EDIT: The downloaded folder is located at /Users/-----------/Documents/JavaJuno/stanford-ner-2015-04-20 and contains these files:

LICENSE.txt         lib             ner.sh              stanford-ner-3.5.2-javadoc.jar
NERDemo.java            ner-gui.bat         sample-conll-file.txt       stanford-ner-3.5.2-sources.jar
README.txt          ner-gui.command         sample-w-time.txt       stanford-ner-3.5.2.jar
build.xml           ner-gui.sh          sample.ner.txt          stanford-ner.jar
classifiers         ner.bat             sample.txt

Then I added an environment variable STANFORD_MODELS:

os.environ["STANFORD_MODELS"] = "/Users/-----------/Documents/JavaJuno/stanford-ner-2015-04-20"

Calling from nltk.tag import StanfordNERTagger yields the error:

ImportError                               Traceback (most recent call last)
<ipython-input-356-f4287e573edc> in <module>()
----> 1 from nltk.tag import StanfordNERTagger

ImportError: cannot import name StanfordNERTagger

Also, in case it is relevant, this is what is in my nltk.tag folder:

__init__.py api.pyc     crf.py      hmm.pyc     senna.py    sequential.pyc  stanford.py tnt.pyc
__init__.pyc    brill.py    crf.pyc     hunpos.py   senna.pyc   simplify.py stanford.pyc    util.py
api.py      brill.pyc   hmm.py      hunpos.pyc  sequential.py   simplify.pyc    tnt.py      util.pyc

EDIT2: I have managed to import the NER Tagger, by using:

from nltk.tag.stanford import NERTagger

but now, when running an example from the NLTK website, I get:

In [360]: st = NERTagger('english.all.3class.distsim.crf.ser.gz')
---------------------------------------------------------------------------
LookupError                               Traceback (most recent call last)
<ipython-input-360-0c0ab770b0ff> in <module>()
----> 1 st = NERTagger('english.all.3class.distsim.crf.ser.gz')

/Library/Python/2.7/site-packages/nltk/tag/stanford.pyc in __init__(self, *args, **kwargs)
    158 
    159     def __init__(self, *args, **kwargs):
--> 160         super(NERTagger, self).__init__(*args, **kwargs)
    161 
    162     @property

/Library/Python/2.7/site-packages/nltk/tag/stanford.pyc in __init__(self, path_to_model, path_to_jar, encoding, verbose, java_options)
     40                 self._JAR, path_to_jar,
     41                 searchpath=(), url=_stanford_url,
---> 42                 verbose=verbose)
     43 
     44         self._stanford_model = find_file(path_to_model,

/Library/Python/2.7/site-packages/nltk/__init__.pyc in find_jar(name, path_to_jar, env_vars, searchpath, url, verbose)
    595                     (name, url))
    596     div = '='*75
--> 597     raise LookupError('\n\n%s\n%s\n%s' % (div, msg, div))
    598 
    599 ##########################################################################

LookupError: 

===========================================================================
  NLTK was unable to find stanford-ner.jar! Set the CLASSPATH
  environment variable.

  For more information, on stanford-ner.jar, see:
    <http://nlp.stanford.edu/software>
===========================================================================

So it seems I have set the environment variable incorrectly. Can anybody help me with that?

eager2learn asked Sep 18 '15 13:09


1 Answer

I worked it out.

  1. Set STANFORD_MODELS as you did. # I learnt that from you, thx!
  2. import nltk.tag.stanford as st
  3. tagger = st.StanfordNERTagger(PATH_TO_GZ, PATH_TO_JAR) # here PATH_TO_GZ and PATH_TO_JAR are the FULL paths to the files "english.all.3class.distsim.crf.ser.gz" and "stanford-ner.jar"
  4. Now the tagger is usable. # try tagger.tag('Rami Eid is studying at Stony Brook University in NY'.split())
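
The steps above can be sketched as a short script. This is only a sketch under assumptions: NER_HOME is a placeholder path, the helper names ner_paths and make_tagger are made up for illustration, and the layout (model under classifiers/, stanford-ner.jar at the top level) is assumed from the folder listing in the question. It falls back to the older NERTagger name when StanfordNERTagger is not present, since the question's install only had the former.

```python
import os

# Hypothetical install location; point this at your own unpacked download.
NER_HOME = "/path/to/stanford-ner-2015-04-20"


def ner_paths(ner_home, model="english.all.3class.distsim.crf.ser.gz"):
    """Return the full paths (PATH_TO_GZ, PATH_TO_JAR) inside the unpacked folder."""
    # Assumed layout: models live in classifiers/, the jar sits at the top level.
    path_to_gz = os.path.join(ner_home, "classifiers", model)
    path_to_jar = os.path.join(ner_home, "stanford-ner.jar")
    return path_to_gz, path_to_jar


def make_tagger(ner_home):
    """Build the tagger with whichever class name the installed NLTK exposes."""
    # Imported lazily so ner_paths can be used even without NLTK installed.
    import nltk.tag.stanford as st

    # Newer releases expose StanfordNERTagger; older ones only have NERTagger.
    tagger_cls = getattr(st, "StanfordNERTagger", None) or getattr(st, "NERTagger")
    path_to_gz, path_to_jar = ner_paths(ner_home)
    return tagger_cls(path_to_gz, path_to_jar)


if __name__ == "__main__":
    gz, jar = ner_paths(NER_HOME)
    if os.path.isfile(gz) and os.path.isfile(jar):
        # Tagging also needs a working Java installation.
        tagger = make_tagger(NER_HOME)
        print(tagger.tag("Rami Eid is studying at Stony Brook University in NY".split()))
    else:
        print("Edit NER_HOME first; could not find:", gz, "or", jar)
```

Checking that both files exist before building the tagger gives a clearer message than the LookupError when a path is mistyped.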

It has nothing to do with CLASSPATH: once the full path to the jar is passed in, NLTK does not need to search for it.
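
For comparison, here is a rough sketch of the environment-variable route that the error message in the question points to. It assumes, going by the traceback and the NLTK note quoted in the question, that the jar is looked up through CLASSPATH and the model file through STANFORD_MODELS. NER_HOME and the stanford_env helper are hypothetical names, and I have not checked this against every NLTK version.

```python
import os

# Hypothetical install location; replace with your own unpacked folder.
NER_HOME = "/path/to/stanford-ner-2015-04-20"


def stanford_env(ner_home):
    """Build the two environment variables the NLTK lookup is assumed to consult."""
    return {
        # The LookupError in the question asks for CLASSPATH to locate the jar.
        "CLASSPATH": os.path.join(ner_home, "stanford-ner.jar"),
        # The NLTK note describes STANFORD_MODELS as a list of paths for models.
        "STANFORD_MODELS": os.path.join(ner_home, "classifiers"),
    }


# Set the variables for this Python process only, before creating the tagger.
os.environ.update(stanford_env(NER_HOME))
```

With both variables set, the short model name from the question's example should resolve without full paths, but treat that as untested.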

Hope it helps!

Skywalker326 answered Nov 16 '22 00:11