 

Stanford Dependency Parser Setup and NLTK

So I got the "standard" Stanford Parser to work thanks to danger89's answer to this previous post, Stanford Parser and NLTK.

However, I am now trying to get the dependency parser to work and it seems the method highlighted in the previous link no longer works. Here is my code:

import nltk
import os
java_path = "C:\\Program Files\\Java\\jre1.8.0_51\\bin\\java.exe" 
os.environ['JAVAHOME'] = java_path


from nltk.parse import stanford
os.environ['STANFORD_PARSER'] = 'path/jar'
os.environ['STANFORD_MODELS'] = 'path/jar'
parser = stanford.StanfordDependencyParser(model_path="path/jar/englishPCFG.ser.gz")

sentences = parser.raw_parse_sents(nltk.sent_tokenize("The iPod is expensive but pretty."))

I get the following error: AttributeError: 'module' object has no attribute 'StanfordDependencyParser'

The only thing I changed was "StanfordParser" to "StanfordDependencyParser". Any ideas how I can get this to work?

I also tried the Stanford Neural Dependency parser by importing it as shown in the documentation here: http://www.nltk.org/_modules/nltk/parse/stanford.html

This one didn't work either.

Pretty new to NLTK. Thanks in advance for any helpful input.

Max asked Dec 02 '15

People also ask

What is Stanford dependency parser?

Introduction. A dependency parser analyzes the grammatical structure of a sentence, establishing relationships between "head" words and words which modify those heads. The figure below shows a dependency parse of a short sentence.
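To make the head/modifier idea concrete, here is a minimal sketch of a dependency parse represented as plain data. The triples below are hand-built for illustration (they are not Stanford parser output), but they use standard relation labels like nsubj and det:

```python
# Toy illustration of dependency relations (not Stanford parser output):
# each modifier word is linked to its head word with a relation label.
sentence = "The iPod is expensive"

# (head, relation, modifier) triples, hand-built for illustration
dependencies = [
    ("expensive", "nsubj", "iPod"),   # "iPod" is the subject of "expensive"
    ("expensive", "cop",   "is"),     # "is" is a copula
    ("iPod",      "det",   "The"),    # "The" is the determiner of "iPod"
]

# Find the modifiers of a given head word
modifiers = [mod for head, rel, mod in dependencies if head == "expensive"]
print(modifiers)  # ['iPod', 'is']
```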

What is parsing in NLTK?

NLTK Parsers. Classes and interfaces for producing tree structures that represent the internal organization of a text. This task is known as “parsing” the text, and the resulting tree structures are called the text's “parses”.
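As a rough sketch of what such a tree structure looks like, here is a hand-built constituency tree for a short sentence, represented with plain nested tuples rather than NLTK's Tree class, plus a helper that walks it:

```python
# Toy constituency tree as nested tuples: (label, children...),
# mirroring the bracketed trees that parsers produce.
tree = ("S",
        ("NP", ("DT", "The"), ("NN", "iPod")),
        ("VP", ("VBZ", "is"), ("ADJP", ("JJ", "expensive"))))

def leaves(node):
    """Collect the words (leaf strings) of the tree, left to right."""
    if isinstance(node, str):
        return [node]
    label, *children = node
    return [word for child in children for word in leaves(child)]

print(leaves(tree))  # ['The', 'iPod', 'is', 'expensive']
```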

What is Stanford NLP parser?

The Stanford Parser can be used to generate constituency and dependency parses of sentences for a variety of languages. The package includes PCFG, Shift Reduce, and Neural Dependency parsers. To fully utilize the parser, also make sure to download the models jar for the specific language you are interested in.


1 Answer

The StanfordDependencyParser API is a new class added in NLTK version 3.1.

Ensure that you have the latest NLTK, either through pip:

pip install -U nltk

or through your Linux package manager, e.g.:

sudo apt-get install python-nltk

or, on Windows, download the installer from https://pypi.python.org/pypi/nltk and run it; it should overwrite your previous NLTK version.
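Since the class only exists from NLTK 3.1 onward, it can help to check the installed version at runtime before importing. A small, hypothetical helper for comparing a version string like nltk.__version__ against that minimum:

```python
# Hypothetical helper: check whether a version string such as
# nltk.__version__ meets a minimum (major, minor) requirement.
def meets_minimum(version, required=(3, 1)):
    parts = []
    for piece in version.split(".")[:2]:
        digits = "".join(ch for ch in piece if ch.isdigit())
        parts.append(int(digits) if digits else 0)
    return tuple(parts) >= required

print(meets_minimum("3.0.5"))  # False: StanfordDependencyParser missing
print(meets_minimum("3.1"))    # True
```

Usage would be something like `meets_minimum(nltk.__version__)` right after `import nltk`.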

Then you can use the API as shown in the documentation:

from nltk.parse.stanford import StanfordDependencyParser
dep_parser = StanfordDependencyParser(model_path="edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz")
print([parse.tree() for parse in dep_parser.raw_parse("The quick brown fox jumps over the lazy dog.")])

[out]:

[Tree('jumps', [Tree('fox', ['The', 'quick', 'brown']), Tree('dog', ['over', 'the', 'lazy'])])]

(Note: Make sure your jar paths and os.environ values are correct. On Windows, paths look like something\\something\\some\\path; on Unix, something/something/some/path.)

See also https://github.com/nltk/nltk/wiki/Installing-Third-Party-Software#stanford-tagger-ner-tokenizer-and-parser and, if you need a TL;DR solution, https://github.com/alvations/nltk_cli

alvas answered Oct 18 '22