I have implemented a generator-based scanner in Python that tokenizes a string into tuples of the form (token type, token value):
for token in scan("a(b)"):
    print(token)
would print
('literal', 'a')
('l_paren', '(')
...
The next task is to parse the token stream, and for that I need to be able to look one item ahead of the current one without advancing the pointer. Because iterators and generators produce each item only as needed rather than the whole sequence at once, lookahead is a bit trickier than with lists: the next item is not known until __next__() is called.
What could a straightforward implementation of a generator-based lookahead look like? Currently I'm using a workaround that involves making a list out of the generator:
token_list = list(scan(string))
The lookahead is then easily implemented with something like this:
try:
    next_token = token_list[index + 1]
except IndexError:
    # No token after the current one.
    next_token = None
Of course this works fine. But thinking it over, a second question arises: is there really any point in making scan() a generator in the first place?
Pretty good answers there, but my favorite approach would be to use itertools.tee: given an iterator, it returns two (or more if requested) that can be advanced independently. It buffers in memory just as much as needed (i.e., not much, if the iterators don't get very "out of step" from each other). E.g.:
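import itertools
from collections.abc import Iterator  # collections.Iterator was removed in Python 3.10

class IteratorWithLookahead(Iterator):
    def __init__(self, it):
        # Two independent iterators over the same underlying items.
        self.it, self.nextit = itertools.tee(iter(it))
        self._advance()

    def _advance(self):
        # nextit always runs one item ahead of it; None means nothing is left.
        self.lookahead = next(self.nextit, None)

    def __next__(self):
        self._advance()
        return next(self.it)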
You can wrap any iterator with this class, and then use the .lookahead attribute of the wrapper to see which item will be returned next; it is None once nothing is left. I like to leave all the real logic to itertools.tee and just provide this thin glue!-)
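For illustration, here is a small self-contained sketch of how the wrapper might be used on a token stream. The wrapper is repeated so the snippet runs on its own. The scan() below is only a hypothetical stand-in for the scanner from the question, and the "r_paren" token name is an assumption (the question only shows "literal" and "l_paren").

```python
import itertools
from collections.abc import Iterator


class IteratorWithLookahead(Iterator):
    """The tee-based wrapper, repeated so this snippet is self-contained."""

    def __init__(self, it):
        self.it, self.nextit = itertools.tee(iter(it))
        self._advance()

    def _advance(self):
        self.lookahead = next(self.nextit, None)

    def __next__(self):
        self._advance()
        return next(self.it)


def scan(text):
    """Hypothetical stand-in scanner; token names are assumed, not the real ones."""
    names = {"(": "l_paren", ")": "r_paren"}
    for ch in text:
        yield (names.get(ch, "literal"), ch)


tokens = IteratorWithLookahead(scan("a(b)"))

# Pair each token with the one that will come after it (None at the end).
pairs = [(token, tokens.lookahead) for token in tokens]
for token, upcoming in pairs:
    print(token, "-> next:", upcoming)
```

One caveat with this sketch: since None marks the end of the stream, it cannot be told apart from a genuine None item. For (type, value) tuples like the ones here that should not matter.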