
pyparsing error

I am stuck on this error in pyparsing:

from pyparsing import Word,alphas,nums,Or,Regex,StringEnd
ws = Regex(r'\s*')
dot = "."
w = Word(alphas) + (ws | dot) + StringEnd()
w.leaveWhitespace()
w.parseString('AMIT.')

Returns the following error:

ParseException: Expected end of text (at char 4), (line:1, col:5)
lesnar_56 asked Feb 08 '12 06:02



1 Answer

| creates a "match first" expression, not "match longest".

The first alternative is the regex, which matches 0 or more whitespace characters. Zero characters is a valid match, so the regex succeeds on an empty string at the '.', and the dot alternative is never tried.

Then the next element to parse is StringEnd, but the parse position is still located at the '.'—so, fail!
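
The empty match described above can be reproduced with the standard re module alone. This is only a sketch of what the regex does at that position, not of pyparsing's internals; the variable names are made up for illustration.

```python
import re

text = "AMIT."
pos = 4  # parse position after Word(alphas) has consumed "AMIT"

# \s* is allowed to match zero characters, so it "succeeds" at the '.'
m = re.compile(r"\s*").match(text, pos)

matched = m.group()         # '' (an empty match)
end = m.end()               # 4 (the position did not move past the '.')
at_end = end == len(text)   # False, so an end-of-text check fails here

print(repr(matched), end, at_end)
```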

Here is some more detailed output, obtained by adding setDebug() calls to your grammar expressions:

>>> from pyparsing import Literal
>>> w = Word(alphas).setDebug() + (ws.setDebug() | Literal(dot).setDebug()) + StringEnd()
>>> w.parseString('AMIT.')
Match W:(abcd...) at loc 0(1,1)
Matched W:(abcd...) -> ['AMIT']
Match Re:('\\s*') at loc 4(1,5)
Matched Re:('\\s*') -> ['']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "c:\python26\lib\site-packages\pyparsing-1.5.6-py2.6.egg\pyparsing.py", line 1032, in parseString
    raise exc
pyparsing.ParseException: Expected end of text (at char 4), (line:1, col:5)

To get your grammar to work you could:

  • change the | operator to ^ (match longest instead of match first)

  • change the regex to \s+ instead of \s* (so that at least one whitespace character is required for a match)

  • change your second term to Optional(dot)
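
The three options above can be compared side by side. This is a minimal sketch: the helpers build() and parses() and the variable names are made up for illustration, and dot is wrapped in Literal explicitly rather than left as a plain string.

```python
from pyparsing import (Word, alphas, Regex, Literal, Optional,
                       StringEnd, ParseException)

def build(middle):
    """Hypothetical helper: Word + middle term + StringEnd, whitespace-sensitive."""
    grammar = Word(alphas) + middle + StringEnd()
    grammar.leaveWhitespace()  # same setting as in the question
    return grammar

def parses(grammar, text):
    """Hypothetical helper: return the token list, or None if parsing fails."""
    try:
        return grammar.parseString(text).asList()
    except ParseException:
        return None

dot = Literal(".")

original = build(Regex(r"\s*") | dot)      # match first: the empty regex match wins
use_longest = build(Regex(r"\s*") ^ dot)   # option 1: match longest
use_plus = build(Regex(r"\s+") | dot)      # option 2: require some whitespace
use_optional = build(Optional(dot))        # option 3: no explicit whitespace term

print(parses(original, "AMIT."))      # None
print(parses(use_longest, "AMIT."))   # ['AMIT', '.']
print(parses(use_plus, "AMIT."))      # ['AMIT', '.']
print(parses(use_optional, "AMIT."))  # ['AMIT', '.']
```

Note that the options are not equivalent on other inputs. With \s+, for example, the middle term is no longer optional, so a bare 'AMIT' is rejected.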

In general, explicit testing for whitespace is not consistent with the pyparsing philosophy—pyparsing is not the same as re.
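
As an illustration of that point, here is a sketch that drops both the whitespace regex and the leaveWhitespace() call and relies on pyparsing's default whitespace skipping. The variable names are made up for illustration.

```python
from pyparsing import Word, alphas, Optional, StringEnd

# No whitespace term and no leaveWhitespace(): by default pyparsing
# skips whitespace before each expression it tries to match.
grammar = Word(alphas) + Optional(".") + StringEnd()

tight = grammar.parseString("AMIT.").asList()
spaced = grammar.parseString("AMIT  .  ").asList()
bare = grammar.parseString("AMIT ").asList()

print(tight)   # ['AMIT', '.']
print(spaced)  # ['AMIT', '.']
print(bare)    # ['AMIT']
```

If whitespace between the word and the dot must be rejected, this is too permissive, and the whitespace-sensitive grammar from the question is the one to fix instead.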

PaulMcG answered Oct 30 '22 19:10
