I get following result when i execute stanford parser from nltk.
(S (VP (VB get) (NP (PRP me)) (ADVP (RB now))))
but i need it in the form
S -> VP
VP -> VB NP ADVP
VB -> get
PRP -> me
RB -> now
How can I get this result, perhaps using recursive function. Is there in-built function already?
First to navigate a tree, see How to iterate through all nodes of a tree? and How to navigate a nltk.tree.Tree? :
>>> from nltk.tree import Tree
>>> bracket_parse = "(S (VP (VB get) (NP (PRP me)) (ADVP (RB now))))"
>>> ptree = Tree.fromstring(bracket_parse)
>>> ptree
Tree('S', [Tree('VP', [Tree('VB', ['get']), Tree('NP', [Tree('PRP', ['me'])]), Tree('ADVP', [Tree('RB', ['now'])])])])
>>> for subtree in ptree.subtrees():
... print subtree
...
(S (VP (VB get) (NP (PRP me)) (ADVP (RB now))))
(VP (VB get) (NP (PRP me)) (ADVP (RB now)))
(VB get)
(NP (PRP me))
(PRP me)
(ADVP (RB now))
(RB now)
And what you're looking for is https://github.com/nltk/nltk/blob/develop/nltk/tree.py#L341:
>>> ptree.productions()
[S -> VP, VP -> VB NP ADVP, VB -> 'get', NP -> PRP, PRP -> 'me', ADVP -> RB, RB -> 'now']
Note that Tree.productions()
returns a Production
object, see https://github.com/nltk/nltk/blob/develop/nltk/tree.py#L22 and https://github.com/nltk/nltk/blob/develop/nltk/grammar.py#L236.
If you want a string form of the grammar rules, you can either do:
>>> for rule in ptree.productions():
... print rule
...
S -> VP
VP -> VB NP ADVP
VB -> 'get'
NP -> PRP
PRP -> 'me'
ADVP -> RB
RB -> 'now'
Or
>>> rules = [str(p) for p in ptree.productions()]
>>> rules
['S -> VP', 'VP -> VB NP ADVP', "VB -> 'get'", 'NP -> PRP', "PRP -> 'me'", 'ADVP -> RB', "RB -> 'now'"]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With