I am using the Stanford parser with NLTK in Python, and I set up the Stanford NLP libraries with help from "Stanford Parser and NLTK".
from nltk.parse.stanford import StanfordParser
from nltk.parse.stanford import StanfordDependencyParser

parser = StanfordParser(model_path="edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz")
dep_parser = StanfordDependencyParser(model_path="edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz")
one = "John sees Bill"

# Phrase-structure parse
parsed_Sentence = parser.raw_parse(one)
# GUI
for line in parsed_Sentence:
    print(line)
    line.draw()

# Dependency parse
parsed_Sentence = [parse.tree() for parse in dep_parser.raw_parse(one)]
print(parsed_Sentence)
# GUI
for line in parsed_Sentence:
    print(line)
    line.draw()
I am getting wrong parse and dependency trees, as shown in the example below: the parser treats 'sees' as a noun instead of a verb.
What should I do? It works perfectly when I change the sentence, e.g. (one = 'John see Bill'). The correct output for this sentence can be viewed here: correct output of parse tree
Example of correct output is also shown below:
NLTK Parsers. Classes and interfaces for producing tree structures that represent the internal organization of a text. This task is known as “parsing” the text, and the resulting tree structures are called the text's “parses”.
The parser can read various forms of plain text input and can output various analysis formats, including part-of-speech tagged text, phrase structure trees, and a grammatical relations (typed dependency) format.
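To make the tagged-text and phrase-structure output formats a bit more concrete, here is a minimal sketch in plain Python that pulls (word, tag) pairs out of a Penn-style bracketed parse. The two bracketed strings are hand-written for illustration (they are not output captured from the Stanford parser), and `leaf_tags` is a hypothetical helper name, not part of NLTK.

```python
import re

# Matches a preterminal such as "(VBZ sees)": a tag followed by a single word.
LEAF = re.compile(r"\(([^\s()]+) ([^\s()]+)\)")

def leaf_tags(bracketed):
    """Return (word, tag) pairs from a Penn-style bracketed parse string."""
    return [(word, tag) for tag, word in LEAF.findall(bracketed)]

# Hand-written examples (assumptions, not real parser output).
noun_reading = "(ROOT (NP (NNP John) (NNS sees) (NNP Bill)))"
verb_reading = "(ROOT (S (NP (NNP John)) (VP (VBZ sees) (NP (NNP Bill)))))"

print(leaf_tags(noun_reading))
print(leaf_tags(verb_reading))
```

Comparing the tag on 'sees' in the two readings (NNS versus VBZ) is a quick way to spot the kind of mis-tagging described in the question.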
A dependency parser analyzes the grammatical structure of a sentence, establishing relationships between "head" words and words which modify those heads.
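As a rough illustration of the head/dependent idea, the sketch below groups dependents under their head word using plain Python data structures. The dependency triples are hand-annotated for "John sees Bill" (an assumption, not parser output), the relation names follow the older Stanford Dependencies style, and `dependents_by_head` is a hypothetical helper.

```python
# Hand-annotated dependencies for "John sees Bill" (an assumption for
# illustration, not output captured from a parser).
# Each entry is (dependent, relation, head); the root has head None.
dependencies = [
    ("John", "nsubj", "sees"),
    ("sees", "root", None),
    ("Bill", "dobj", "sees"),
]

def dependents_by_head(deps):
    """Group (relation, dependent) pairs under the head word they modify."""
    grouped = {}
    for dependent, relation, head in deps:
        if head is not None:
            grouped.setdefault(head, []).append((relation, dependent))
    return grouped

print(dependents_by_head(dependencies))
```

Keying on the word itself keeps the sketch short but would break for sentences that repeat a word; token indices would be the safer key in real use.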
Once again, no model is perfect (see Python NLTK pos_tag not returning the correct part-of-speech tag) ;P
You can try a "more accurate" parser, using the NeuralDependencyParser.
First set up the parser properly with the correct environment variables (see Stanford Parser and NLTK and https://gist.github.com/alvations/e1df0ba227e542955a8a), then:
>>> from nltk.internals import find_jars_within_path
>>> from nltk.parse.stanford import StanfordNeuralDependencyParser
>>> parser = StanfordNeuralDependencyParser(model_path="edu/stanford/nlp/models/parser/nndep/english_UD.gz")
>>> stanford_dir = parser._classpath[0].rpartition('/')[0]
>>> slf4j_jar = stanford_dir + '/slf4j-api.jar'
>>> parser._classpath = list(parser._classpath) + [slf4j_jar]
>>> parser.java_options = '-mx5000m'
>>> sent = "John sees Bill"
>>> [parse.tree() for parse in parser.raw_parse(sent)]
[Tree('sees', ['John', 'Bill'])]
Do note that the NeuralDependencyParser only produces dependency trees.