Simple grammar give ValueError in Python

Question

I'm new to Python, nltk and nlp. I have written simple grammar. But when running the program it gives below error. Please help me to solve this error

Grammar:-

S -> NP
NP -> PN|PRO|D[NUM=?n] N[NUM=?n]|D[NUM=?n] A N[NUM=?n]|D[NUM=?n] N[NUM=?n] PP|QP N[NUM=?n]|A N[NUM=?n]|D[NUM=?n] NOM PP|D[NUM=?n] NOM
PP -> P NP
D[NUM=sg] -> 'a'
D -> 'the'
N[NUM=sg] -> 'boy'|'girl'|'room'|'garden'|'hair'
N[NUM=pl] -> 'dogs'|'cats'
PN -> 'saumya'|'dinesh'
PRO -> 'she'|'he'|'we'
A -> 'tall'|'naughty'|'long'|'three'|'black'
P -> 'with'|'in'|'from'|'at'
QP -> 'some'
NOM -> A NOM|N[NUM=?n]

Code:-

import nltk

grammar = nltk.data.load('file:english_grammer.cfg')
rdparser = nltk.RecursiveDescentParser(grammar)
sent = "a dogs".split()
trees = rdparser.parse(sent)

for tree in trees: print (tree)

Error:-

alvas · Accepted Answer

I don't think NLTK CFG grammar readers can read the format of your CFG with square brackets.

First let's try a CFG grammar without the square brackets:

from nltk.grammar import CFG

grammar_string = '''
S -> NP
PP -> P NP
D -> 'the'
PN -> 'saumya'|'dinesh'
PRO -> 'she'|'he'|'we'
A -> 'tall'|'naughty'|'long'|'three'|'black'
P -> 'with'|'in'|'from'|'at'
QP -> 'some'
'''

grammar = CFG.fromstring(grammar_string)
print grammar

[out]:

Grammar with 18 productions (start state = S)
    S -> NP
    PP -> P NP
    D -> 'the'
    PN -> 'saumya'
    PN -> 'dinesh'
    PRO -> 'she'
    PRO -> 'he'
    PRO -> 'we'
    A -> 'tall'
    A -> 'naughty'
    A -> 'long'
    A -> 'three'
    A -> 'black'
    P -> 'with'
    P -> 'in'
    P -> 'from'
    P -> 'at'
    QP -> 'some'

Now let's put the square brackets in:

from nltk.grammar import CFG

grammar_string = '''
S -> NP
PP -> P NP
D -> 'the'
PN -> 'saumya'|'dinesh'
PRO -> 'she'|'he'|'we'
A -> 'tall'|'naughty'|'long'|'three'|'black'
P -> 'with'|'in'|'from'|'at'
QP -> 'some'
N[NUM=sg] -> 'boy'|'girl'|'room'|'garden'|'hair'
N[NUM=pl] -> 'dogs'|'cats'
'''

grammar = CFG.fromstring(grammar_string)
print grammar

[out]:

Traceback (most recent call last):
  File "test.py", line 33, in <module>
    grammar = CFG.fromstring(grammar_string)
  File "/usr/local/lib/python2.7/dist-packages/nltk/grammar.py", line 519, in fromstring
    encoding=encoding)
  File "/usr/local/lib/python2.7/dist-packages/nltk/grammar.py", line 1273, in read_grammar
    (linenum+1, line, e))
ValueError: Unable to parse line 10: N[NUM=sg] -> 'boy'|'girl'|'room'|'garden'|'hair'
Expected an arrow

Going back to your grammar, it seems like you're using the square brackets to denote constraints or uncontraints, so the solution would be:

Using underscore for contrainted non-terminals and
to make a rule for unconstrainted non-terminals

So your cfg rules will look as such:

from nltk.parse import RecursiveDescentParser
from nltk.grammar import CFG

grammar_string = '''
S -> NP
NP -> PN | PRO | D N | D A N | D N PP | QP N | A N | D NOM PP | D NOM

PP -> P NP
PN -> 'saumya'|'dinesh'
PRO -> 'she'|'he'|'we'
A -> 'tall'|'naughty'|'long'|'three'|'black'
P -> 'with'|'in'|'from'|'at'
QP -> 'some'

D -> D_def | D_sg
D_def -> 'the'
D_sg -> 'a'

N -> N_sg | N_pl
N_sg -> 'boy'|'girl'|'room'|'garden'|'hair'
N_pl -> 'dogs'|'cats'
'''

grammar = CFG.fromstring(grammar_string)

rdparser = RecursiveDescentParser(grammar)
sent = "a dogs".split()
trees = rdparser.parse(sent)

for tree in trees:
    print (tree)

[out]:

(S (NP (D (D_sg a)) (N (N_pl dogs))))

dantiston · Answer

It looks like you're trying to use NLTK's feature grammars, which do use the square bracket syntax to denote features and feature agreement. NLTK's parser to use feature grammars is the FeatureEarleyChartParser (as opposed to RecursiveDescentParser).

From the NLTK documentation:

>>> from __future__ import print_function
>>> import nltk
>>> from nltk import grammar, parse
>>> g = """
... % start DP
... DP[AGR=?a] -> D[AGR=?a] N[AGR=?a]
... D[AGR=[NUM='sg', PERS=3]] -> 'this' | 'that'
... D[AGR=[NUM='pl', PERS=3]] -> 'these' | 'those'
... D[AGR=[NUM='pl', PERS=1]] -> 'we'
... D[AGR=[PERS=2]] -> 'you'
... N[AGR=[NUM='sg', GND='m']] -> 'boy'
... N[AGR=[NUM='pl', GND='m']] -> 'boys'
... N[AGR=[NUM='sg', GND='f']] -> 'girl'
... N[AGR=[NUM='pl', GND='f']] -> 'girls'
... N[AGR=[NUM='sg']] -> 'student'
... N[AGR=[NUM='pl']] -> 'students'
... """
>>> grammar = grammar.FeatureGrammar.fromstring(g)
>>> tokens = 'these girls'.split()
>>> parser = parse.FeatureEarleyChartParser(grammar)
>>> trees = parser.parse(tokens)
>>> for tree in trees: print(tree)
(DP[AGR=[GND='f', NUM='pl', PERS=3]]
  (D[AGR=[NUM='pl', PERS=3]] these)
  (N[AGR=[GND='f', NUM='pl']] girls))

Simple grammar give ValueError in Python

Tags:

python-3.x

nlp

nltk

Chandana Indisooriya

2 Answers

alvas

dantiston

Recent Activity

Donate For Us

Simple grammar give ValueError in Python

Tags:

python-3.x

nlp

nltk

Chandana Indisooriya

2 Answers

alvas

dantiston

Related questions

Recent Activity

Donate For Us