nltk StanfordNERTagger : NoClassDefFoundError: org/slf4j/LoggerFactory (In Windows)

NOTE: I am using Python 2.7 as part of Anaconda distribution. I hope this is not a problem for nltk 3.1.

I am trying to use nltk for NER as follows:

import nltk
from nltk.tag.stanford import StanfordNERTagger 
#st = StanfordNERTagger('stanford-ner/all.3class.distsim.crf.ser.gz', 'stanford-ner/stanford-ner.jar')
st = StanfordNERTagger('english.all.3class.distsim.crf.ser.gz') 
print st.tag(str)

but I get

Exception in thread "main" java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory
    at edu.stanford.nlp.io.IOUtils.<clinit>(IOUtils.java:41)
    at edu.stanford.nlp.ie.AbstractSequenceClassifier.classifyAndWriteAnswers(AbstractSequenceClassifier.java:1117)
    at edu.stanford.nlp.ie.AbstractSequenceClassifier.classifyAndWriteAnswers(AbstractSequenceClassifier.java:1076)
    at edu.stanford.nlp.ie.AbstractSequenceClassifier.classifyAndWriteAnswers(AbstractSequenceClassifier.java:1057)
    at edu.stanford.nlp.ie.crf.CRFClassifier.main(CRFClassifier.java:3088)
Caused by: java.lang.ClassNotFoundException: org.slf4j.LoggerFactory
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 5 more

Traceback (most recent call last):
  File "X:\jnk.py", line 47, in <module>
    print st.tag(str)
  File "X:\Anaconda2\lib\site-packages\nltk\tag\stanford.py", line 66, in tag
    return sum(self.tag_sents([tokens]), []) 
  File "X:\Anaconda2\lib\site-packages\nltk\tag\stanford.py", line 89, in tag_sents
    stdout=PIPE, stderr=PIPE)
  File "X:\Anaconda2\lib\site-packages\nltk\internals.py", line 134, in java
    raise OSError('Java command failed : ' + str(cmd))
OSError: Java command failed : ['X:\\PROGRA~1\\Java\\JDK18~1.0_6\\bin\\java.exe', '-mx1000m', '-cp', 'X:\\stanford\\stanford-ner.jar', 'edu.stanford.nlp.ie.crf.CRFClassifier', '-loadClassifier', 'X:\\stanford\\classifiers\\english.all.3class.distsim.crf.ser.gz', '-textFile', 'x:\\appdata\\local\\temp\\tmpqjsoma', '-outputFormat', 'slashTags', '-tokenizerFactory', 'edu.stanford.nlp.process.WhitespaceTokenizer', '-tokenizerOptions', '"tokenizeNLs=false"', '-encoding', 'utf8']

However, I can see that the slf4j jar is in my lib folder. Do I need to update an environment variable?

Edit

Thanks, everyone, for your help, but I still get the same error. Here is what I tried recently:

import nltk
from nltk.tag import StanfordNERTagger 
print(nltk.__version__)
stanford_ner_dir = 'X:\\stanford\\'
eng_model_filename= stanford_ner_dir + 'classifiers\\english.all.3class.distsim.crf.ser.gz'
my_path_to_jar= stanford_ner_dir + 'stanford-ner.jar'
st = StanfordNERTagger(model_filename=eng_model_filename, path_to_jar=my_path_to_jar) 
print st._stanford_model
print st._stanford_jar

st.tag('Rami Eid is studying at Stony Brook University in NY'.split())

and also

import nltk
from nltk.tag import StanfordNERTagger 
print(nltk.__version__)
st = StanfordNERTagger('english.all.3class.distsim.crf.ser.gz') 
print st._stanford_model
print st._stanford_jar
st.tag('Rami Eid is studying at Stony Brook University in NY'.split())

I get

3.1
X:\stanford\classifiers\english.all.3class.distsim.crf.ser.gz
X:\stanford\stanford-ner.jar

After that it prints the same stack trace as before: java.lang.ClassNotFoundException: org.slf4j.LoggerFactory

Any idea why this might be happening? I updated my CLASSPATH as well. I even added all the relevant folders to my PATH environment variable: the folder where I unzipped the Stanford jars, the folder where I unzipped slf4j, and the lib folder inside the stanford folder. I have no idea why this is happening :(

Could it be Windows? I have had problems with Windows paths before.

Update

  1. The Stanford NER version I have is 3.6.0. The zip file is named stanford-ner-2015-12-09.zip.

  2. I also tried using stanford-ner-3.6.0.jar instead of stanford-ner.jar, but I still get the same error.

  3. When I right-click on stanford-ner-3.6.0.jar, I notice

(screenshot: jar properties)

I see this for all the files that I have extracted, even the slf4j files. Could this be causing the problem?

  4. Finally, why does the error message say

java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory

I do not see any folder named org anywhere.

Update: Env variables

Here are my env variables

CLASSPATH
.;
X:\jre1.8.0_60\lib\rt.jar;
X:\stanford\stanford-ner-3.6.0.jar;
X:\stanford\stanford-ner.jar;
X:\stanford\lib\slf4j-simple.jar;
X:\stanford\lib\slf4j-api.jar;
X:\slf4j\slf4j-1.7.13\slf4j-1.7.13\slf4j-log4j12-1.7.13.jar

STANFORD_MODELS
X:\stanford\classifiers

JAVA_HOME
X:\PROGRA~1\Java\JDK18~1.0_6

PATH
X:\PROGRA~1\Java\JDK18~1.0_6\bin;
X:\stanford;
X:\stanford\lib;
X:\slf4j\slf4j-1.7.13\slf4j-1.7.13

Is anything wrong here?

AbtPst asked Dec 18 '15 18:12


5 Answers

NOTE:

Below is a temporary hack that works with:

  • NLTK version 3.1
  • Stanford NER compiled on 2015-12-09

This solution is NOT meant to be a permanent one.

Always refer to https://github.com/nltk/nltk/wiki/Installing-Third-Party-Software for the latest instructions on how to interface with the Stanford NLP tools using NLTK!

If you do not want to use this "hack", please track updates on this issue: https://github.com/nltk/nltk/issues/1237, or use the NER tool compiled on 2015-04-20.


In Short

Make sure that you have:

  • NLTK version 3.1
  • Stanford NER compiled on 2015-12-09
  • Set the environment variables for CLASSPATH and STANFORD_MODELS

To set environment variables in Windows:

set CLASSPATH=%CLASSPATH%;C:\some\path\to\stanford-ner\stanford-ner.jar
set STANFORD_MODELS=%STANFORD_MODELS%;C:\some\path\to\stanford-ner\classifiers

To set environment variables in Linux:

export STANFORDTOOLSDIR=/home/some/path/to/stanfordtools/
export CLASSPATH=$STANFORDTOOLSDIR/stanford-ner-2015-12-09/stanford-ner.jar
export STANFORD_MODELS=$STANFORDTOOLSDIR/stanford-ner-2015-12-09/classifiers

Then:

>>> from nltk.internals import find_jars_within_path
>>> from nltk.tag import StanfordNERTagger
>>> st = StanfordNERTagger('english.all.3class.distsim.crf.ser.gz') 
# Note this is where your stanford_jar is saved.
# We are accessing the environment variables you've
# set through the NLTK API.
>>> print st._stanford_jar
/home/alvas/stanford-ner-2015-12-09/stanford-ner.jar
# Get the directory that holds the jar. On Linux:
>>> stanford_dir = st._stanford_jar.rpartition('/')[0]
# On Windows, split on the backslash instead:
# >>> stanford_dir = st._stanford_jar.rpartition("\\")[0]
# Use the `find_jars_within_path` function to collect all the
# jar files of the Stanford NER tool, including those under lib/.
>>> stanford_jars = find_jars_within_path(stanford_dir)
# Put the jars back into the `stanford_jar` classpath. On Linux:
>>> st._stanford_jar = ':'.join(stanford_jars)
# On Windows, join with ';' instead:
# >>> st._stanford_jar = ';'.join(stanford_jars)
>>> st.tag('Rami Eid is studying at Stony Brook University in NY'.split())
[(u'Rami', u'PERSON'), (u'Eid', u'PERSON'), (u'is', u'O'), (u'studying', u'O'), (u'at', u'O'), (u'Stony', u'ORGANIZATION'), (u'Brook', u'ORGANIZATION'), (u'University', u'ORGANIZATION'), (u'in', u'O'), (u'NY', u'O')]
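The Linux and Windows variants above differ only in the path separator and the classpath separator, both of which Python exposes through `os.path` and `os.pathsep`. Here is a minimal sketch that avoids the distinction. It assumes that walking the directory with `os.walk` is an acceptable stand-in for NLTK's `find_jars_within_path`; the helper name `classpath_for_dir` is hypothetical and not part of NLTK.

```python
import os


def classpath_for_dir(jar_path):
    """Build a classpath from every .jar under the directory holding jar_path.

    Hypothetical helper, not part of NLTK. It uses os.pathsep, which is
    ';' on Windows and ':' on Linux/macOS.
    """
    base = os.path.dirname(os.path.abspath(jar_path))
    jars = []
    for root, _dirs, files in os.walk(base):
        for name in sorted(files):
            if name.endswith('.jar'):
                jars.append(os.path.join(root, name))
    return os.pathsep.join(jars)
```

With the tagger from the session above, this would be used as `st._stanford_jar = classpath_for_dir(st._stanford_jar)` before calling `st.tag(...)`.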
alvas answered Nov 17 '22 08:11


Yesterday I ran into exactly the problem you describe.

There are 3 things you need to do.

1) Update your NLTK.

pip install -U nltk

Your version should be >3.1. I also see you are using

from nltk.tag.stanford import StanfordNERTagger

However, you should import it from the new location:

from nltk.tag import StanfordNERTagger

2) Download slf4j and update your CLASSPATH.

Here is how you update your CLASSPATH.

import os

javapath = "/Users/aerin/Downloads/stanford-ner-2014-06-16/stanford-ner.jar:/Users/aerin/java/slf4j-1.7.13/slf4j-log4j12-1.7.13.jar"
os.environ['CLASSPATH'] = javapath

As you can see above, javapath contains two paths: one is where stanford-ner.jar is, and the other is where you downloaded slf4j-log4j12-1.7.13.jar (it can be downloaded here: http://www.slf4j.org/download.html).

3) Don't forget to specify where you downloaded 'english.all.3class.distsim.crf.ser.gz' & 'stanford-ner.jar'

st = StanfordNERTagger('/Users/aerin/Downloads/stanford-ner-2014-06-16/classifiers/english.all.3class.distsim.crf.ser.gz','/Users/aerin/Downloads/stanford-ner-2014-06-16/stanford-ner.jar') 

st.tag("Doneyo lab did such an awesome job!".split())
aerin answered Nov 17 '22 07:11


I fixed it!

You should give the full path of slf4j-api.jar in CLASSPATH.

Instead of adding the jar path to the system environment variable, you can do it in code like this:

import os

_CLASS_PATH = "."
if os.environ.get('CLASSPATH') is not None:
    _CLASS_PATH = os.environ.get('CLASSPATH')
# Raw string so the backslashes in the Windows path are kept literally.
os.environ['CLASSPATH'] = _CLASS_PATH + r';F:\Python\Lib\slf4j\slf4j-api-1.7.13.jar'
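The snippet above hardcodes the Windows `;` separator and appends the jar every time it runs. Here is a small sketch of the same idea that uses `os.pathsep` and skips the jar if it is already listed. The function name `append_to_classpath` and its `environ` parameter are illustrative assumptions, not part of NLTK.

```python
import os


def append_to_classpath(jar_path, environ=os.environ):
    """Append jar_path to CLASSPATH unless it is already listed.

    Hypothetical helper. Falls back to '.' when CLASSPATH is unset,
    mirroring the snippet above, and joins entries with os.pathsep.
    """
    entries = environ.get('CLASSPATH', '.').split(os.pathsep)
    if jar_path not in entries:
        entries.append(jar_path)
    environ['CLASSPATH'] = os.pathsep.join(entries)
    return environ['CLASSPATH']
```

For example, `append_to_classpath(r'F:\Python\Lib\slf4j\slf4j-api-1.7.13.jar')` would update `os.environ['CLASSPATH']` in place for the current process only.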

Important: nltk/*/stanford.py resets the classpath like this:

stdout, stderr = java(cmd, classpath=self._stanford_jar, stdout=PIPE, stderr=PIPE)

e.g. \Python34\Lib\site-packages\nltk\tokenize\stanford.py, line 90

You can fix it like this:

_CLASS_PATH = "."
if os.environ.get('CLASSPATH') is not None:
    _CLASS_PATH = os.environ.get('CLASSPATH')
stdout, stderr = java(cmd, classpath=(self._stanford_jar, _CLASS_PATH), stdout=PIPE, stderr=PIPE)
Dylan Wang answered Nov 17 '22 09:11


The current Stanford NER tagger version is not compatible with nltk because it requires additional jars that nltk cannot add to the CLASSPATH.

Instead, prefer an older version of the Stanford NER tagger that works fine, like this one: http://nlp.stanford.edu/software/stanford-ner-2015-04-20.zip
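To see which jars actually provide a missing class, it helps to know that a jar is a zip archive and that a class such as `org.slf4j.LoggerFactory` is stored inside it as the entry `org/slf4j/LoggerFactory.class`, so there is no `org` folder on disk to look for. Here is a rough sketch that checks a list of jars with the standard `zipfile` module; the function name `jars_providing` is made up for illustration.

```python
import zipfile


def jars_providing(class_name, jar_paths):
    """Return the jars whose archive contains the given fully qualified class.

    Hypothetical helper: 'org.slf4j.LoggerFactory' is looked up as the
    zip entry 'org/slf4j/LoggerFactory.class'.
    """
    entry = class_name.replace('.', '/') + '.class'
    found = []
    for path in jar_paths:
        with zipfile.ZipFile(path) as jar:
            if entry in jar.namelist():
                found.append(path)
    return found
```

Calling it with `'org.slf4j.LoggerFactory'` and the jar paths you intend to put on the classpath shows whether any of them can satisfy the class named in the `NoClassDefFoundError`.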

nanarth answered Nov 17 '22 07:11


For those who want to use Stanford NER >= 3.6.0 instead of the 2015-01-30 (3.5.1) or other old version, do this instead:

  1. Put the stanford-ner.jar and slf4j-api.jar into the same folder

    For example, I put the following files in /path-to-libs/

    • stanford-ner-3.6.0.jar
    • slf4j-api-1.7.18.jar
  2. Then:

    import nltk

    # Java expands a trailing "*" to every jar in that folder.
    classpath = "/path-to-libs/*"

    st = nltk.tag.StanfordNERTagger(
        "/path-to-model/ner-model.ser.gz",
        "/path-to-libs/stanford-ner-3.6.0.jar"
    )
    st._stanford_jar = classpath
    result = st.tag(["Hello"])
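A Java classpath entry ending in `*` picks up the jar files directly inside that folder and does not descend into subfolders. To check what the wildcard would cover before running the tagger, a quick sketch along these lines may help. The helper name `jars_matched_by_wildcard` is hypothetical, and the `*.jar` glob is only an approximation of Java's matching (on case-sensitive file systems it will miss files ending in `.JAR`).

```python
import glob
import os


def jars_matched_by_wildcard(libs_dir):
    """List the .jar files directly inside libs_dir (no recursion), sorted.

    Hypothetical helper that approximates what a "libs_dir/*" classpath
    entry would cover.
    """
    pattern = os.path.join(libs_dir, '*.jar')
    return sorted(glob.glob(pattern))
```

If `jars_matched_by_wildcard("/path-to-libs")` does not list both the Stanford NER jar and the slf4j-api jar, the wildcard classpath will not resolve the missing class either.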
    
EwyynTomato answered Nov 17 '22 07:11