I have the following code in Python.
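import nltk

sent = [("very", "ADJ"), ("colourful", "ADJ"), ("ice", "NN"), ("cream", "NN"), ("van", "NN")]
patterns = r"""
NP: {<ADJ>*<NN>+}
"""
NPChunker = nltk.RegexpParser(patterns)  # create chunk parser
for s in NPChunker.nbest_parse(sent):  # nbest_parse is inherited from ParserI
    print(s)  # print the bracketed tree (s.draw() would open a viewer window instead)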
The output is:
(S (NP very/ADJ colourful/ADJ ice/NN cream/NN van/NN))
But I expected the output to include two more parse trees:
(S (NP very/ADJ colourful/ADJ ice/NN) (NP cream/NN) (NP van/NN))
(S (NP very/ADJ colourful/ADJ ice/NN cream/NN) van/NN)
The problem is that only the first regular expression is taken by the RegexpParser. How can I generate all possible parse trees at once?
This is not possible with the RegexpParser class. It inherits the nbest_parse method from the ParserI interface, and the source code (https://github.com/nltk/nltk/blob/master/nltk/parse/api.py) shows that this default implementation just runs the parser's parse method and returns that single result as an iterable.
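To make that fallback concrete, here is a rough sketch of the pattern. It is not NLTK's actual source: SketchParser and OneTreeChunker are hypothetical stand-ins, and the only assumption is that the default n-best method wraps whatever single result parse returns.

```python
class SketchParser:
    """Hypothetical stand-in for a parser interface (not NLTK's ParserI)."""

    def parse(self, sent):
        raise NotImplementedError

    def nbest_parse(self, sent, n=None):
        # Default behaviour: ask for the single parse and wrap it in a list.
        tree = self.parse(sent)
        return [tree] if tree is not None else []


class OneTreeChunker(SketchParser):
    """Hypothetical chunker that only ever produces one structure."""

    def parse(self, sent):
        return ("S", list(sent))


results = OneTreeChunker().nbest_parse([("ice", "NN"), ("cream", "NN")])
print(len(results))
```

However many analyses the grammar would allow in principle, a parser built this way can only ever hand back the one tree its parse method produced.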
As someone tried to explain in "Chunking with nltk", the chunking classes are not the right tool for this purpose (yet!). Have a look at http://nltk.org/book/ch08.html: it has some quick examples, but they would only take you halfway to what you want to achieve, and the rest would need a lot of pre-processing and smart design.
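If all you need is every way of chunking the sentence under that one pattern, one workaround is to enumerate the chunkings yourself without NLTK. The following is a minimal Python 3 sketch under my own assumptions: all_np_chunkings and to_bracketed are made-up helper names, the NP rule ADJ* NN+ is hard-coded as an ordinary regular expression over the tag sequence, and every token may either start a chunk or stay outside one.

```python
import re


def all_np_chunkings(tagged, pattern=r"(?:ADJ )*(?:NN )+"):
    """Yield every split of `tagged` into NP chunks and unchunked tokens."""
    chunk_re = re.compile(pattern)
    n = len(tagged)

    def is_np(i, j):
        # Match the pattern against the space-terminated tags of tagged[i:j].
        tags = "".join(tag + " " for _, tag in tagged[i:j])
        return chunk_re.fullmatch(tags) is not None

    def expand(i):
        if i == n:
            yield []
            return
        # Option 1: leave token i outside any chunk.
        for rest in expand(i + 1):
            yield [tagged[i]] + rest
        # Option 2: start an NP chunk at i that ends just before j.
        for j in range(i + 1, n + 1):
            if is_np(i, j):
                for rest in expand(j):
                    yield [("NP", tagged[i:j])] + rest

    return expand(0)


def to_bracketed(chunking):
    """Render a chunking in the bracketed style used in the question."""
    parts = []
    for item in chunking:
        if isinstance(item[1], list):  # an ("NP", [tokens]) chunk
            parts.append("(NP %s)" % " ".join("%s/%s" % wt for wt in item[1]))
        else:  # a plain (word, tag) token
            parts.append("%s/%s" % item)
    return "(S %s)" % " ".join(parts)


sent = [("very", "ADJ"), ("colourful", "ADJ"), ("ice", "NN"),
        ("cream", "NN"), ("van", "NN")]
trees = [to_bracketed(c) for c in all_np_chunkings(sent)]
for t in trees:
    print(t)
```

This yields the three trees from the question along with the other chunkings the pattern allows, so you would still have to filter or rank them yourself. It is plain enumeration, not a replacement for a real grammar-based parser.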