I am trying to use NLTK interface for Stanford NER in the python enviornment, nltk.tag.stanford.NERTagger
.
from nltk.tag.stanford import NERTagger
st = NERTagger('/usr/share/stanford-ner/classifiers/all.3class.distsim.crf.ser.gz',
'/usr/share/stanford-ner/stanford-ner.jar')
st.tag('Rami Eid is studying at Stony Brook University in NY'.split())
I am supposed to get the output:
[('Rami', 'PERSON'), ('Eid', 'PERSON'), ('is', 'O'), ('studying', 'O'),
('at', 'O'), ('Stony', 'ORGANIZATION'), ('Brook', 'ORGANIZATION'),
('University', 'ORGANIZATION'), ('in', 'O'), ('NY', 'LOCATION')]
I have installed NLTK according the procedure described in the NLTK website. However, I can not find /usr/share/stanford-ner at all. Where and how do I find the whole package and install it in my directory.
There is no installation procedure, you should be able to run Stanford NER from that folder. Normally, Stanford NER is run from the command line (i.e., shell or terminal). Current releases of Stanford NER require Java 1.8 or later.
Stanford NER is also known as CRFClassifier. The software provides a general implementation of (arbitrary order) linear chain Conditional Random Field (CRF) sequence models. That is, by training your own models on labeled data, you can actually use this code to build sequence models for NER or any other task.
An alternative to NLTK's named entity recognition (NER) classifier is provided by the Stanford NER tagger. This tagger is largely seen as the standard in named entity recognition, but since it uses an advanced statistical learning algorithm it's more computationally expensive than the option provided by NLTK.
Just thought it would be worth mentioning that the import line is now:
from nltk.tag.stanford import StanfordNERTagger
It might be easier to look at the more recent interfaces to Stanford CoreNLP for python which are available here: http://nlp.stanford.edu/software/corenlp.shtml
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With