How to generate multiple parse trees for an ambiguous sentence in NLTK?

Question

I have the following code in Python.

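import nltk

sent = [("very", "ADJ"), ("colourful", "ADJ"), ("ice", "NN"), ("cream", "NN"), ("van", "NN")]
patterns = r"""
  NP: {<ADJ>*<NN>+}
"""
NPChunker = nltk.RegexpParser(patterns)  # create chunk parser
# nbest_parse is the NLTK 2.x API
for s in NPChunker.nbest_parse(sent):
    print(s)  # print the bracketed tree; s.draw() opens a window and returns None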

The output is:

(S (NP very/ADJ colourful/ADJ ice/NN cream/NN van/NN))

But I expected two more parse trees:

(S (NP very/ADJ colourful/ADJ ice/NN) (NP cream/NN) (NP van/NN))
(S (NP very/ADJ colourful/ADJ ice/NN cream/NN) van/NN)

The problem is that RegexpParser returns only the first (greedy) match of the regular expression. How can I generate all possible parse trees at once?

Viktor Vojnovski · Accepted Answer

This is not possible with the RegexpParser class. It inherits the nbest_parse method from the ParserI interface, and the source code (https://github.com/nltk/nltk/blob/master/nltk/parse/api.py) shows that the default implementation just runs the parser's own parse method and returns that single result wrapped in an iterable. So you never get more than one tree back.
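To make that fallback concrete, here is a minimal sketch in plain Python. It is not NLTK's actual code: SketchParser and GreedyChunker are hypothetical names, and the tree is a plain tuple. It only illustrates why a parser that implements parse alone can never return more than one result from nbest_parse.

```python
class SketchParser:
    """Hypothetical stand-in for a parser base class (not NLTK's code)."""

    def parse(self, sent):
        raise NotImplementedError

    def nbest_parse(self, sent, n=None):
        # Default behaviour: ask parse() for its single answer and wrap it
        # in a list. The n argument has no effect in this fallback.
        tree = self.parse(sent)
        return [tree] if tree else []


class GreedyChunker(SketchParser):
    """Hypothetical chunker that always puts the whole input in one NP."""

    def parse(self, sent):
        return ("S", [("NP", list(sent))])


sent = [("very", "ADJ"), ("colourful", "ADJ"), ("ice", "NN"),
        ("cream", "NN"), ("van", "NN")]
results = GreedyChunker().nbest_parse(sent, n=10)
print(len(results))
```

However large n is, the list holds exactly one tree, because parse only ever produces one.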

As someone tried to explain in "Chunking with nltk", the chunking classes are not the right tool for this purpose (yet!). Have a look at http://nltk.org/book/ch08.html. It has some quick examples, but they will only take you halfway to what you want to achieve, and the rest requires a lot of pre-processing and smart design.
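If you only need the alternative chunkings for a single pattern, one option is to enumerate them yourself. The following is a rough sketch in plain Python, not an NLTK feature: all_chunkings, is_np and CHUNK_RE are names made up for this example, and the pattern is hard-coded to mirror <ADJ>*<NN>+. It tries every way of splitting the sentence into NP chunks and loose tokens and formats each result as a bracketed string.

```python
import re

# Hypothetical helper pattern: mirrors the chunk rule <ADJ>*<NN>+ and must
# match the whole tag string of a candidate chunk.
CHUNK_RE = re.compile(r"(?:<ADJ>)*(?:<NN>)+$")


def is_np(tagged):
    """Return True if the tags of `tagged` match ADJ* NN+ from start to end."""
    tags = "".join("<%s>" % tag for _, tag in tagged)
    return CHUNK_RE.match(tags) is not None


def fmt(tagged):
    """Format tagged tokens as word/TAG pairs separated by spaces."""
    return " ".join("%s/%s" % (word, tag) for word, tag in tagged)


def all_chunkings(tagged):
    """Yield every split of `tagged` into NP chunks and unchunked tokens."""
    if not tagged:
        yield []
        return
    # Option 1: leave the first token outside any chunk.
    for rest in all_chunkings(tagged[1:]):
        yield [fmt(tagged[:1])] + rest
    # Option 2: start an NP chunk here, trying every possible length.
    for end in range(1, len(tagged) + 1):
        if is_np(tagged[:end]):
            for rest in all_chunkings(tagged[end:]):
                yield ["(NP %s)" % fmt(tagged[:end])] + rest


sent = [("very", "ADJ"), ("colourful", "ADJ"), ("ice", "NN"),
        ("cream", "NN"), ("van", "NN")]
trees = ["(S %s)" % " ".join(parts) for parts in all_chunkings(sent)]
for tree in trees:
    print(tree)
```

This is exhaustive, so it also prints analyses you probably do not want (for example the one with no chunks at all). The three trees from the question are among the results, and you would still have to filter or rank the list yourself.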

Tags: python, regex, nlp, nltk, gamma