 

NLTK: Why does NLTK not recognize the CLASSPATH variable for stanford-ner?

This is my code:

from nltk.tag import StanfordNERTagger
st = StanfordNERTagger('english.all.3class.distsim.crf.ser.gz')

And I get this error:

NLTK was unable to find stanford-ner.jar! Set the CLASSPATH
  environment variable.

This is what my .bashrc looks like on Ubuntu:

export CLASSPATH=/home/wolfgang/Downloads/stanford-ner-2015-04-20/stanford-ner-3.5.2.jar
export STANFORD_MODELS=/home/wolfgang/Downloads/stanford-ner-2015-04-20/classifiers

I also tried printing the environment variable in Python this way:

import os
os.environ.get('CLASSPATH')

And I receive:

'/home/wolfgang/Downloads/stanford-ner-2015-04-20/stanford-ner-3.5.2.jar'

Therefore the variables are being SET!

What is wrong then?

Why doesn't NLTK recognize my environment variables?

chapman asked Sep 28 '15 09:09



1 Answer

Rename the .jar file from stanford-ner-3.5.2.jar to stanford-ner.jar, and update the CLASSPATH environment variable to point to the renamed file.

Apparently NLTK has a name_pattern variable in nltk/internals.py which only accepts a CLASSPATH entry if its file name matches stanford-ner.jar, so the versioned name stanford-ner-3.5.2.jar is skipped even though the variable is set.
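To make the name check concrete, here is a minimal sketch of the kind of comparison described above. It is not NLTK's actual code: the helper classpath_has_jar is a hypothetical name, and it assumes the lookup simply compares the file name of each CLASSPATH entry against stanford-ner.jar.

```python
import os


def classpath_has_jar(classpath, expected_name="stanford-ner.jar"):
    """Hypothetical helper: True if any CLASSPATH entry is named expected_name.

    Assumption: the lookup compares only the file name of each entry,
    not the directory it lives in.
    """
    # CLASSPATH may hold several entries separated by os.pathsep (':' on Linux).
    entries = [p for p in classpath.split(os.pathsep) if p]
    return any(os.path.basename(p) == expected_name for p in entries)


# The two paths from the question and the answer.
versioned = "/home/wolfgang/Downloads/stanford-ner-2015-04-20/stanford-ner-3.5.2.jar"
renamed = "/home/wolfgang/Downloads/stanford-ner-2015-04-20/stanford-ner.jar"

print(classpath_has_jar(versioned))  # False: the file name carries a version suffix
print(classpath_has_jar(renamed))    # True: the file name is exactly stanford-ner.jar
```

Running a check like this against os.environ.get('CLASSPATH', '') shows why the variable can be set correctly and still be rejected: the value is read, but the file name does not match.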

wolfgang answered Sep 22 '22 13:09
