So I got the "standard" Stanford Parser to work thanks to danger89's answer to this earlier post, Stanford Parser and NLTK.
However, I am now trying to get the dependency parser to work and it seems the method highlighted in the previous link no longer works. Here is my code:
import nltk
import os
java_path = "C:\\Program Files\\Java\\jre1.8.0_51\\bin\\java.exe"
os.environ['JAVAHOME'] = java_path
from nltk.parse import stanford
os.environ['STANFORD_PARSER'] = 'path/jar'
os.environ['STANFORD_MODELS'] = 'path/jar'
parser = stanford.StanfordDependencyParser(model_path="path/jar/englishPCFG.ser.gz")
sentences = parser.raw_parse_sents(nltk.sent_tokenize("The iPod is expensive but pretty."))
I get the following error: 'module' object has no attribute 'StanfordDependencyParser'
The only thing I changed was "StanfordDependencyParser" from "StanfordParser". Any ideas how I can get this to work?
I also tried the Stanford Neural Dependency parser by importing it as shown in the documentation here: http://www.nltk.org/_modules/nltk/parse/stanford.html
This one didn't work either.
Pretty new to NLTK. Thanks in advance for any helpful input.
Introduction. A dependency parser analyzes the grammatical structure of a sentence, establishing relationships between "head" words and words which modify those heads. The figure below shows a dependency parse of a short sentence.
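To make "head" and "modifier" concrete, a dependency parse can be written as (head, relation, dependent) triples. The triples below are hand-annotated for illustration using Stanford-style relation labels; the exact labels a given model emits may differ.

```python
# Hand-written dependency triples for "The iPod is expensive but pretty."
# (illustrative only; real parser output may use different relation labels).
triples = [
    ("expensive", "nsubj", "iPod"),   # "iPod" is the subject of the predicate
    ("iPod", "det", "The"),           # "The" is the determiner of "iPod"
    ("expensive", "cop", "is"),       # copular verb linking subject and predicate
    ("expensive", "conj", "pretty"),  # coordinated adjective
    ("expensive", "cc", "but"),       # coordinating conjunction
]

# Every word except the root appears exactly once as a dependent:
dependents = [dep for head, rel, dep in triples]
print(sorted(dependents))
```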
NLTK Parsers. Classes and interfaces for producing tree structures that represent the internal organization of a text. This task is known as “parsing” the text, and the resulting tree structures are called the text's “parses”.
The Stanford Parser can be used to generate constituency and dependency parses of sentences for a variety of languages. The package includes PCFG, Shift-Reduce, and Neural Dependency parsers. To fully utilize the parser, also make sure to download the models jar for the specific language you are interested in.
The StanfordDependencyParser API is a new class introduced in NLTK version 3.1.
Ensure that you have the latest NLTK available either through pip
pip install -U nltk
or through your linux package manager, e.g.:
sudo apt-get install python-nltk
or on Windows, download it from https://pypi.python.org/pypi/nltk and install it; it should overwrite your previous NLTK version.
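Since StanfordDependencyParser requires NLTK 3.1 or later, it is worth confirming the installed version after upgrading (e.g. by printing nltk.__version__). A minimal, dependency-free sketch of the version comparison (the helper names here are illustrative, not part of NLTK):

```python
# Sketch: compare a dotted version string against the 3.1 minimum.
def version_tuple(version):
    """Turn a string like '3.0.5' into (3, 0, 5) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))

def supports_dependency_parser(nltk_version):
    # StanfordDependencyParser was added in NLTK 3.1.
    return version_tuple(nltk_version) >= (3, 1)

print(supports_dependency_parser("3.0.5"))  # expect False
print(supports_dependency_parser("3.1"))    # expect True
```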
Then you can use the API as shown in the documentation:
from nltk.parse.stanford import StanfordDependencyParser
dep_parser = StanfordDependencyParser(model_path="edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz")
print([parse.tree() for parse in dep_parser.raw_parse("The quick brown fox jumps over the lazy dog.")])
[out]:
[Tree('jumps', [Tree('fox', ['The', 'quick', 'brown']), Tree('dog', ['over', 'the', 'lazy'])])]
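To make the printed structure concrete: the Tree above nests each head over its modifiers. A pure-Python sketch (no Stanford jars required) that models the same shape with (head, children) tuples and walks it to recover head/modifier pairs:

```python
# The dependency tree from the output above, modeled as (head, [children])
# tuples; leaves are plain strings.
tree = ("jumps", [("fox", ["The", "quick", "brown"]),
                  ("dog", ["over", "the", "lazy"])])

def edges(node):
    """Collect (head, modifier) pairs by walking the nested structure."""
    head, children = node
    pairs = []
    for child in children:
        if isinstance(child, tuple):      # inner node: record its head word, recurse
            pairs.append((head, child[0]))
            pairs.extend(edges(child))
        else:                             # leaf word
            pairs.append((head, child))
    return pairs

print(edges(tree))
```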
(Note: Make sure you get your jar paths and os.environ settings correct; on Windows a path looks like something\\something\\some\\path, on Unix like something/something/some/path.)
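One way to sidestep the Windows-vs-Unix separator issue is to build the paths with os.path.join, which picks the right separator for the current OS. A sketch with placeholder jar locations that you would replace with your own:

```python
import os

# Placeholder directory: substitute wherever you unpacked the Stanford parser.
parser_dir = os.path.join(os.path.expanduser("~"), "stanford-parser")

# os.path.join inserts the correct separator for the current OS automatically,
# so the same code works on Windows and Unix.
os.environ["STANFORD_PARSER"] = os.path.join(parser_dir, "stanford-parser.jar")
os.environ["STANFORD_MODELS"] = os.path.join(parser_dir, "stanford-parser-models.jar")

print(os.environ["STANFORD_PARSER"])
```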
See also https://github.com/nltk/nltk/wiki/Installing-Third-Party-Software#stanford-tagger-ner-tokenizer-and-parser and, if you need a TL;DR solution, see https://github.com/alvations/nltk_cli