
Using StanfordParser to get typed dependencies from a parsed sentence

Using NLTK's StanfordParser, I can parse a sentence like this:

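import os
from nltk.parse import stanford

# Raw strings keep the Windows backslashes literal
os.environ['STANFORD_PARSER'] = r'C:\jars'
os.environ['STANFORD_MODELS'] = r'C:\jars'
os.environ['JAVAHOME'] = r'C:\ProgramData\Oracle\Java\javapath'
parser = stanford.StanfordParser(model_path=r"C:\jars\englishPCFG.ser.gz")
sentences = parser.parse(("bring me a red ball",))
for sentence in sentences:
    # Outside an interactive session a bare expression prints nothing
    print(repr(sentence))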

The result is:

Tree('ROOT', [Tree('S', [Tree('VP', [Tree('VB', ['Bring']),
Tree('NP', [Tree('DT', ['a']), Tree('NN', ['red'])]), Tree('NP',
[Tree('NN', ['ball'])])]), Tree('.', ['.'])])])

How can I use the Stanford parser to get typed dependencies in addition to the parse tree above? Something like:

root(ROOT-0, bring-1)
iobj(bring-1, me-2)
det(ball-5, a-3)
amod(ball-5, red-4)
dobj(bring-1, ball-5)
Yarik asked Apr 13 '15 10:04




1 Answer

NLTK's StanfordParser module doesn't (currently) wrap the tree to Stanford Dependencies conversion code. You can use my library PyStanfordDependencies, which wraps the dependency converter.

If nltk_tree is the sentence variable from the question's code snippet (a single NLTK Tree), then this works:

#!/usr/bin/python3
import StanfordDependencies

# Use str() to convert the NLTK tree to Penn Treebank format
penn_treebank_tree = str(nltk_tree) 

sd = StanfordDependencies.get_instance(jar_filename='point to Stanford Parser JAR file')
converted_tree = sd.convert_tree(penn_treebank_tree)

# Print Typed Dependencies
for node in converted_tree:
    print('{}({}-{},{}-{})'.format(
            node.deprel,
            converted_tree[node.head - 1].form if node.head != 0 else 'ROOT',
            node.head,
            node.form,
            node.index))
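
To check the formatting logic of the loop above without a Java installation or the Stanford JAR, here is a minimal sketch that runs the same loop over hand-written tokens. The Token namedtuple and the format_dependencies helper are hypothetical stand-ins, not part of PyStanfordDependencies; they only model the four attributes the loop reads (index, form, head, deprel). The head and relation values are copied from the output requested in the question rather than produced by a parser, and the format string adds a space after the comma to match that output.

```python
from collections import namedtuple

# Hypothetical stand-in for the token objects used in the loop above;
# only the four attributes that loop reads are modelled.
Token = namedtuple("Token", ["index", "form", "head", "deprel"])


def format_dependencies(tokens):
    """Return one 'deprel(head-i, dependent-j)' string per token."""
    lines = []
    for node in tokens:
        # head == 0 marks the root; any other head is a 1-based token index
        head_form = tokens[node.head - 1].form if node.head != 0 else "ROOT"
        lines.append("{}({}-{}, {}-{})".format(
            node.deprel, head_form, node.head, node.form, node.index))
    return lines


# Hand-written analysis of "bring me a red ball", taken from the question
tokens = [
    Token(1, "bring", 0, "root"),
    Token(2, "me", 1, "iobj"),
    Token(3, "a", 5, "det"),
    Token(4, "red", 5, "amod"),
    Token(5, "ball", 1, "dobj"),
]

deps = format_dependencies(tokens)
print("\n".join(deps))
```

With a real converter, the list of tokens would come from sd.convert_tree(...) instead of being typed by hand.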
dmcc answered Oct 23 '22 09:10