I want to parse some words and some numbers with pyparsing. Simple, right?
from pyparsing import *
A = Word(nums).setResultsName('A')
B = Word(alphas).setResultsName('B')
expr = OneOrMore(A | B)
result = expr.parseString("123 abc 456 7 d")
print(result)
The code above prints ['123', 'abc', '456', '7', 'd'], so everything worked. Now I want to do some work with these parsed values, and for that I need to know whether each one matched A or B. Is there a way to distinguish between the two?
The only thing I found after some research was the items method of the ParseResults class. But it returns only [('A', '7'), ('B', 'd')], that is, just the last match for each name.
My plan / goal is the following:
for elem in result:
    if elem.is_of_type('A'):
        # do stuff
    elif elem.is_of_type('B'):
        # do something else
How do I distinguish between A and B?
Nice job with getName(). You can also explicitly decorate the returned tokens with a marker, indicating which match was made:
def makeDecoratingParseAction(marker):
    def parse_action_impl(s, l, t):
        return (marker, t[0])
    return parse_action_impl
A = Word(nums).setParseAction(makeDecoratingParseAction("A"))
B = Word(alphas).setParseAction(makeDecoratingParseAction("B"))
expr = OneOrMore(A | B)
result = expr.parseString("123 abc 456 7 d")
print(result.asList())
Gives:
[('A', '123'), ('B', 'abc'), ('A', '456'), ('A', '7'), ('B', 'd')]
Now you can iterate over the returned tuples, and each one is labelled with the appropriate marker.
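As a rough sketch of that iteration (the snake_case helper name and the two collecting lists are only illustrative, not part of the original answer), each tuple can be unpacked into its marker and value and dispatched on the marker:

```python
from pyparsing import OneOrMore, Word, alphas, nums


def make_decorating_parse_action(marker):
    """Return a parse action that tags the matched token with `marker`."""
    def parse_action_impl(s, l, t):
        return (marker, t[0])
    return parse_action_impl


A = Word(nums).setParseAction(make_decorating_parse_action("A"))
B = Word(alphas).setParseAction(make_decorating_parse_action("B"))
expr = OneOrMore(A | B)

numbers, words = [], []  # illustrative containers for the two kinds of match
for marker, value in expr.parseString("123 abc 456 7 d"):
    if marker == "A":
        numbers.append(int(value))  # A tokens are digit strings
    elif marker == "B":
        words.append(value)

print(numbers)
print(words)
```

The original order of the tokens is preserved, so the same loop works when the handling depends on position as well as type.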
You can take this a step further and use a class to capture both the type and the type-specific post-parse logic, and then pass the class as the expression's parse action. This creates instances of the classes in the returned ParseResults, which you can then execute directly with some sort of exec or doIt method:
class ResultsHandler(object):
    """Define base class to initialize location and tokens.
    Call subclass-specific post_init() if one is defined."""
    def __init__(self, s, locn, tokens):
        self.locn = locn
        self.tokens = tokens
        if hasattr(self, "post_init"):
            self.post_init()

class AHandler(ResultsHandler):
    """Handler for A expressions, which contain a numeric string."""
    def post_init(self):
        self.int_value = int(self.tokens[0])
        self.odd_even = ("EVEN", "ODD")[self.int_value % 2]
    def doIt(self):
        print("An A-Type was found at %d with value %d, which is an %s number" % (
            self.locn, self.int_value, self.odd_even))

class BHandler(ResultsHandler):
    """Handler for B expressions, which contain an alphabetic string."""
    def post_init(self):
        self.string = self.tokens[0]
        self.vowels_count = sum(self.string.lower().count(c) for c in "aeiou")
    def doIt(self):
        print("A B-Type was found at %d with value %s, and contains %d vowels" % (
            self.locn, self.string, self.vowels_count))

# pass expression-specific handler classes as parse actions
A = Word(nums).setParseAction(AHandler)
B = Word(alphas).setParseAction(BHandler)
expr = OneOrMore(A | B)

# parse string and run handlers
result = expr.parseString("123 abc 456 7 d")
for handler in result:
    handler.doIt()
Prints:
An A-Type was found at 0 with value 123, which is an ODD number
A B-Type was found at 4 with value abc, and contains 1 vowels
An A-Type was found at 8 with value 456, which is an EVEN number
An A-Type was found at 12 with value 7, which is an ODD number
A B-Type was found at 14 with value d, and contains 0 vowels
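The getName() approach mentioned at the top of this answer is not reproduced in this thread. One plausible shape for it, assuming each expression is wrapped in Group so that every match becomes its own named sub-result, is sketched here; the Group wrapping is an assumption, not something the original posts confirm:

```python
from pyparsing import Group, OneOrMore, Word, alphas, nums

# Assumption: Group turns each match into a nested ParseResults,
# which then carries the results name it was matched under.
A = Group(Word(nums)).setResultsName("A")
B = Group(Word(alphas)).setResultsName("B")
expr = OneOrMore(A | B)

result = expr.parseString("123 abc 456 7 d")

labelled = []
for elem in result:
    # elem is a one-item group; getName() reports which expression matched
    labelled.append((elem.getName(), elem[0]))

print(labelled)
```

The trade-off is that every token is now nested one level deeper, so the value has to be read as elem[0] rather than elem.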
By default a results name keeps only the last token that matched it, which is why items returned just two pairs. In your .setResultsName() calls, you need to specify listAllMatches=True (it defaults to False). Once you've done that, you can loop over result and check whether each token was matched by a given expression by testing for membership in the corresponding named sub-list of result.
from pyparsing import *
#                                    ↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓
A = Word(nums  ).setResultsName('A', listAllMatches=True)
B = Word(alphas).setResultsName('B', listAllMatches=True)
expr = OneOrMore(A | B)
result = expr.parseString("123 abc 456 7 d")
for elem in result:
    if elem in list(result['A']):
        print(elem, 'is in A')
    elif elem in list(result['B']):
        print(elem, 'is in B')
This prints:
123 is in A
abc is in B
456 is in A
7 is in A
d is in B
This is kludgy, and I'm not sure if it's the canonically correct way of doing this, but it seems to work.
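To make it clearer what the membership test is checking against, here is a small sketch that just pulls out the two named lists; the variable names a_matches and b_matches are illustrative only:

```python
from pyparsing import OneOrMore, Word, alphas, nums

A = Word(nums).setResultsName("A", listAllMatches=True)
B = Word(alphas).setResultsName("B", listAllMatches=True)
result = OneOrMore(A | B).parseString("123 abc 456 7 d")

# With listAllMatches=True, each name collects every token it matched,
# in order of appearance, rather than only the last one.
a_matches = result["A"].asList()
b_matches = result["B"].asList()

print(a_matches)
print(b_matches)
```

Note that the membership test compares token text, not identity. It works here because a token made only of digits can never equal one made only of letters, but it would be ambiguous for a grammar in which the same text could match both expressions.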