I have the sentence "John saw a flashy hat at the store".
How can I represent it as a dependency tree like the one shown below?
(S
  (NP (NNP John))
  (VP
    (VBD saw)
    (NP (DT a) (JJ flashy) (NN hat))
    (PP (IN at) (NP (DT the) (NN store)))))
I got this script from here
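import spacy
from nltk import Tree

# 'en' is the shortcut name used by spaCy 1.x/2.x; spaCy 3+ needs a full
# model name such as 'en_core_web_sm'.
en_nlp = spacy.load('en')
doc = en_nlp("John saw a flashy hat at the store")

def to_nltk_tree(node):
    # A token with children becomes a Tree node; a childless token is a leaf.
    if node.n_lefts + node.n_rights > 0:
        return Tree(node.orth_, [to_nltk_tree(child) for child in node.children])
    else:
        return node.orth_

[to_nltk_tree(sent.root).pretty_print() for sent in doc.sents]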
I get the following output, but I am looking for the bracketed tree (NLTK) format.
           saw
 ___________|______________
 |          |             at
 |          |             |
 |         hat          store
 |     _____|_____        |
John   a       flashy    the
Dependency parsing using spaCy: dependency parsing defines the relationships between headwords and their dependents. The head of a sentence depends on no other word and is called the root of the sentence. The verb is usually the root. Every other word is linked to its own headword.
spaCy features a fast and accurate syntactic dependency parser, and has a rich API for navigating the tree. The parser also powers the sentence boundary detection, and lets you iterate over base noun phrases, or “chunks”. You can check whether a Doc object has been parsed by calling doc.
The dep_ property of each child token describes its relationship with its parent; for instance, a dep_ of 'nsubj' means that the token is the nominal subject of its parent.
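As a rough illustration of these head/dependent relations, the sketch below stores a parse of the example sentence as plain tuples and looks up the root and its direct dependents. The head indices and labels are written by hand for illustration (an assumption, not the output of actually running spaCy), and the names `PARSE`, `find_root`, and `children_of` are hypothetical.

```python
# Hand-annotated dependency parse of the example sentence (an assumption
# for illustration; a real parser may pick different heads or labels).
# Each entry is (word, index of its head, dependency label).
# The root points at itself, mirroring how spaCy's token.head behaves for the root.
PARSE = [
    ("John",   1, "nsubj"),
    ("saw",    1, "ROOT"),
    ("a",      4, "det"),
    ("flashy", 4, "amod"),
    ("hat",    1, "dobj"),
    ("at",     1, "prep"),
    ("the",    7, "det"),
    ("store",  5, "pobj"),
]


def find_root(parse):
    """Return the index of the token that is its own head."""
    for i, (_, head, _) in enumerate(parse):
        if head == i:
            return i
    raise ValueError("no root found")


def children_of(parse, head_index):
    """Return the indices of all tokens whose head is `head_index`."""
    return [
        i for i, (_, head, _) in enumerate(parse)
        if head == head_index and i != head_index
    ]


root = find_root(PARSE)
for i in children_of(PARSE, root):
    word, _, label = PARSE[i]
    print(f"{word} --{label}--> {PARSE[root][0]}")
```

The three printed relations correspond to the three branches under "saw" in the tree shown in the question.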
PRON: pronoun, e.g. I, you, he, she, myself, themselves, somebody.
PROPN: proper noun, e.g. Mary, John, London, NATO, HBO.
PUNCT: punctuation, e.g. ., (, ), ?
SCONJ: subordinating conjunction, e.g. if, while, that.
To re-create an NLTK-style tree for SpaCy dependency parses, try using the draw method from nltk.tree instead of pretty_print:
import spacy
from nltk.tree import Tree

# "en" is the shortcut name used by spaCy 1.x/2.x; spaCy 3+ needs a full
# model name such as "en_core_web_sm".
spacy_nlp = spacy.load("en")

def nltk_spacy_tree(sent):
    """
    Visualize the SpaCy dependency tree with nltk.tree
    """
    doc = spacy_nlp(sent)

    def token_format(token):
        # Label each node as word_tag_dependency, e.g. saw_VBD_ROOT
        return "_".join([token.orth_, token.tag_, token.dep_])

    def to_nltk_tree(node):
        if node.n_lefts + node.n_rights > 0:
            return Tree(token_format(node),
                        [to_nltk_tree(child)
                         for child in node.children]
                        )
        else:
            return token_format(node)

    tree = [to_nltk_tree(sent.root) for sent in doc.sents]
    # The first item in the list is the full tree
    tree[0].draw()
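If the goal is a bracketed text form rather than a drawn picture, the same recursion can build a string instead of an nltk Tree. The sketch below is plain Python and works on a hand-written list of `(word, head_index)` pairs standing in for spaCy's output (an assumption for illustration; `HEADS` and `to_brackets` are hypothetical names). It brackets the dependency structure, so every node is a word, not a phrase label such as NP or VP.

```python
# (word, head index) pairs, written by hand to match the dependency tree
# printed in the question; the root ("saw") points at itself.
HEADS = [
    ("John", 1), ("saw", 1), ("a", 4), ("flashy", 4),
    ("hat", 1), ("at", 1), ("the", 7), ("store", 5),
]


def to_brackets(heads, index):
    """Render the subtree rooted at `index` as a bracketed string."""
    word = heads[index][0]
    kids = [i for i, (_, h) in enumerate(heads) if h == index and i != index]
    if not kids:
        # A token without dependents is printed as a bare word.
        return word
    parts = [word] + [to_brackets(heads, k) for k in kids]
    return "(" + " ".join(parts) + ")"


root = next(i for i, (_, h) in enumerate(HEADS) if h == i)
print(to_brackets(HEADS, root))
# (saw John (hat a flashy) (at (store the)))
```

With nltk itself, printing a Tree object (or calling its pformat() method) should give a similar bracketed string, so replacing pretty_print() with print() in the question's script is another way to get this form.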
Note that because SpaCy currently only supports dependency parsing and tagging at the word and noun-phrase level, SpaCy trees won't be as deeply structured as the ones you'd get from, for instance, the Stanford parser, which you can also visualize as a tree:
from nltk.tree import Tree
from nltk.parse.stanford import StanfordParser

# Note: Download Stanford jar dependencies first
# See https://stackoverflow.com/questions/13883277/stanford-parser-and-nltk
stanford_parser = StanfordParser(
    model_path="edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz"
)

def nltk_stanford_tree(sent):
    """
    Visualize the Stanford parse tree (a phrase-structure tree) with nltk.tree
    """
    parse = stanford_parser.raw_parse(sent)
    tree = list(parse)
    # The first item in the list is the full tree
    tree[0].draw()
Now if we run both, nltk_spacy_tree("John saw a flashy hat at the store.") will produce this image and nltk_stanford_tree("John saw a flashy hat at the store.") will produce this one.
Text representations aside, what you're trying to achieve is to get a constituency tree out of a dependency graph. Your example of desired output is a classic constituency tree (as in phrase structure grammar, as opposed to dependency grammar).
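To make the distinction concrete, the sketch below reads the bracketed constituency tree from the question into nested Python lists and separates the phrase labels from the words. It is a minimal hand-rolled reader for illustration only (`parse_brackets` and `leaves` are hypothetical names), not a substitute for a real treebank reader.

```python
import re

# The desired output from the question, copied as a bracketed string.
CONSTITUENCY = """
(S
  (NP (NNP John))
  (VP
    (VBD saw)
    (NP (DT a) (JJ flashy) (NN hat))
    (PP (IN at) (NP (DT the) (NN store)))))
"""


def parse_brackets(text):
    """Parse a bracketed tree into nested lists: [label, child, child, ...]."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read():
        nonlocal pos
        if tokens[pos] == "(":
            pos += 1
            node = [tokens[pos]]      # first token after "(" is the label
            pos += 1
            while tokens[pos] != ")":
                node.append(read())   # children are subtrees or words
            pos += 1                  # consume the closing ")"
            return node
        word = tokens[pos]
        pos += 1
        return word

    return read()


def leaves(node):
    """Collect the words at the bottom of the tree, left to right."""
    if isinstance(node, str):
        return [node]
    return [word for child in node[1:] for word in leaves(child)]


tree = parse_brackets(CONSTITUENCY)
print(tree[0])                          # label of the top node
print([child[0] for child in tree[1:]]) # labels of its direct children
print(" ".join(leaves(tree)))           # the words, in sentence order
```

Here every internal node is a phrase or tag label (S, NP, VP, ...) and the words appear only at the leaves, whereas in the dependency tree from spaCy every node is itself a word.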
While the conversion from constituency trees into dependency graphs is more or less an automated task (for instance, http://www.mathcs.emory.edu/~choi/doc/clear-dependency-2012.pdf), the other direction is not. There has been work on it: see the PAD project https://github.com/ikekonglp/PAD and the paper describing the underlying algorithm: http://homes.cs.washington.edu/~nasmith/papers/kong+rush+smith.naacl15.pdf.
You may also want to reconsider whether you really need a constituency parse; here is a good argument: https://linguistics.stackexchange.com/questions/7280/why-is-constituency-needed-since-dependency-gets-the-job-done-more-easily-and-e