I'm unable to import the Stanford NER tagger in NLTK. This is what I have done: I downloaded the Java code from here and added an environment variable STANFORD_MODELS with the path to the folder where the Java code is stored.
According to the NLTK site, that should be sufficient. It says:
"Tagger models need to be downloaded from http://nlp.stanford.edu/software and the STANFORD_MODELS environment variable set (a colon-separated list of paths)."
Would anybody be kind enough to help me, please?
EDIT: The downloaded folder is located at /Users/-----------/Documents/JavaJuno/stanford-ner-2015-04-20 and contains these files:
LICENSE.txt lib ner.sh stanford-ner-3.5.2-javadoc.jar
NERDemo.java ner-gui.bat sample-conll-file.txt stanford-ner-3.5.2-sources.jar
README.txt ner-gui.command sample-w-time.txt stanford-ner-3.5.2.jar
build.xml ner-gui.sh sample.ner.txt stanford-ner.jar
classifiers ner.bat sample.txt
Then I added the environment variable STANFORD_MODELS:
os.environ["STANFORD_MODELS"] = "/Users/-----------/Documents/JavaJuno/stanford-ner-2015-04-20"
Calling from nltk.tag import StanfordNERTagger yields the error:
ImportError Traceback (most recent call last)
<ipython-input-356-f4287e573edc> in <module>()
----> 1 from nltk.tag import StanfordNERTagger
ImportError: cannot import name StanfordNERTagger
Also, in case it is relevant, this is what is in my nltk/tag folder:
__init__.py api.pyc crf.py hmm.pyc senna.py sequential.pyc stanford.py tnt.pyc
__init__.pyc brill.py crf.pyc hunpos.py senna.pyc simplify.py stanford.pyc util.py
api.py brill.pyc hmm.py hunpos.pyc sequential.py simplify.pyc tnt.py util.pyc
EDIT2: I have managed to import the NER Tagger, by using:
from nltk.tag.stanford import NERTagger
but now, when running an example call from the NLTK website, I get:
In [360]: st = NERTagger('english.all.3class.distsim.crf.ser.gz')
---------------------------------------------------------------------------
LookupError Traceback (most recent call last)
<ipython-input-360-0c0ab770b0ff> in <module>()
----> 1 st = NERTagger('english.all.3class.distsim.crf.ser.gz')
/Library/Python/2.7/site-packages/nltk/tag/stanford.pyc in __init__(self, *args, **kwargs)
158
159 def __init__(self, *args, **kwargs):
--> 160 super(NERTagger, self).__init__(*args, **kwargs)
161
162 @property
/Library/Python/2.7/site-packages/nltk/tag/stanford.pyc in __init__(self, path_to_model, path_to_jar, encoding, verbose, java_options)
40 self._JAR, path_to_jar,
41 searchpath=(), url=_stanford_url,
---> 42 verbose=verbose)
43
44 self._stanford_model = find_file(path_to_model,
/Library/Python/2.7/site-packages/nltk/__init__.pyc in find_jar(name, path_to_jar, env_vars, searchpath, url, verbose)
595 (name, url))
596 div = '='*75
--> 597 raise LookupError('\n\n%s\n%s\n%s' % (div, msg, div))
598
599 ##########################################################################
LookupError:
===========================================================================
NLTK was unable to find stanford-ner.jar! Set the CLASSPATH
environment variable.
For more information, on stanford-ner.jar, see:
<http://nlp.stanford.edu/software>
===========================================================================
So it seems I have set the environment variable incorrectly. Can anybody help me with that?
I worked it out.
It has nothing to do with CLASSPATH.
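A minimal sketch of one way to sidestep the environment variables altogether: the constructor signature in the traceback above accepts both path_to_model and path_to_jar, so both can be passed explicitly. The helper names below (stanford_ner_paths, missing_files, build_tagger) are hypothetical, and the sketch assumes the model file from the example call sits in the classifiers subfolder of the unpacked download, which the question's listing does not confirm.

```python
import os

# Assumption: the model named in the question's example call is stored in
# the "classifiers" subfolder of the unpacked Stanford NER download.
MODEL_NAME = "english.all.3class.distsim.crf.ser.gz"


def stanford_ner_paths(ner_home, model_name=MODEL_NAME):
    """Return (model_path, jar_path) inside an unpacked Stanford NER folder."""
    model_path = os.path.join(ner_home, "classifiers", model_name)
    jar_path = os.path.join(ner_home, "stanford-ner.jar")
    return model_path, jar_path


def missing_files(paths):
    """Return the subset of paths that do not exist as regular files."""
    return [p for p in paths if not os.path.isfile(p)]


def build_tagger(ner_home):
    """Construct the tagger from explicit paths; raise if a file is missing."""
    model_path, jar_path = stanford_ner_paths(ner_home)
    missing = missing_files([model_path, jar_path])
    if missing:
        raise IOError("Missing Stanford NER files: %s" % ", ".join(missing))
    # Imported lazily so the path helpers work without NLTK installed.
    # NERTagger is the class name in the NLTK version from the question;
    # newer releases expose it as StanfordNERTagger.
    from nltk.tag.stanford import NERTagger
    return NERTagger(model_path, jar_path)
```

With this, build_tagger("/path/to/stanford-ner-2015-04-20") would return a tagger without STANFORD_MODELS or CLASSPATH being set. Checking the files first gives a clearer message than the LookupError when a path is wrong. Actually tagging text still requires a working Java installation.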
Hope it helps!