
Implement lookahead iterator for strings in Python

I'm doing some parsing that requires one token of lookahead. What I'd like is a fast function (or class?) that would take an iterator and turn it into a list of tuples in the form (token, lookahead), such that:

>>> a = ['a', 'b', 'c', 'd']
>>> list(lookahead(a))
[('a', 'b'), ('b', 'c'), ('c', 'd'), ('d', None)]

Basically, this would be handy for looking ahead in iterators like this:

for (token, lookahead_1) in lookahead(a):
  pass

I'm not sure whether there's a name for this technique, or a function in itertools that already does this. Any ideas?

Thanks!

Scott asked Jun 23 '11 00:06


1 Answer

There are easier ways if you are just using lists - see Sven's answer. Here is one way to do it for general iterators:

>>> from itertools import tee, izip_longest
>>> a = ['a', 'b', 'c', 'd']
>>> it1, it2 = tee(iter(a))
>>> next(it2)  # discard this first value
'a'
>>> [(x,y) for x,y in izip_longest(it1, it2)]
    # or just list(izip_longest(it1, it2))
[('a', 'b'), ('b', 'c'), ('c', 'd'), ('d', None)]
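The session above is Python 2. On Python 3, `izip_longest` is spelled `zip_longest`. A minimal sketch of the same pairing, assuming Python 3 (the variable name `pairs` and the `None` default passed to `next` are my additions, not part of the session above):

```python
from itertools import tee, zip_longest  # zip_longest is the Python 3 name for izip_longest

a = ['a', 'b', 'c', 'd']

# tee gives two independent iterators over the same underlying source
it1, it2 = tee(a)

# advance the second copy by one item; the None default means an
# empty source does not raise StopIteration here
next(it2, None)

# zip_longest pads the shorter iterator (it2) with None at the end
pairs = list(zip_longest(it1, it2))
print(pairs)  # [('a', 'b'), ('b', 'c'), ('c', 'd'), ('d', None)]
```

Because `it2` runs one step ahead of `it1`, `tee` only has to buffer a single item at a time.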

Here's how to use it in a for loop like in your question.

>>> it1,it2 = tee(iter(a))
>>> next(it2)
'a'
>>> for (token, lookahead_1) in izip_longest(it1,it2):
...     print token, lookahead_1
... 
a b
b c
c d
d None

Finally, here's the function you are looking for:

>>> def lookahead(it):
...     it1, it2 = tee(iter(it))
...     next(it2, None)  # skip the first value; the default avoids StopIteration on empty input
...     return izip_longest(it1, it2)
... 
>>> for (token, lookahead_1) in lookahead(a):
...     print token, lookahead_1
... 
a b
b c
c d
d None
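
If you would rather not depend on `tee`, the same pairs can be produced by a plain generator that remembers the previous item. This is only a sketch, assuming Python 3. The name `lookahead_gen` and the `fill` parameter are hypothetical, chosen for illustration:

```python
def lookahead_gen(iterable, fill=None):
    """Yield (item, next_item) pairs; the last pair is (item, fill)."""
    it = iter(iterable)
    try:
        current = next(it)
    except StopIteration:
        return  # empty input: yield nothing at all
    for upcoming in it:
        yield current, upcoming
        current = upcoming
    # the final item has nothing after it, so pair it with the fill value
    yield current, fill


for token, lookahead_1 in lookahead_gen(['a', 'b', 'c', 'd']):
    print(token, lookahead_1)
```

Passing a distinct `fill` value can help when `None` is itself a legitimate token in the stream.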
John La Rooy answered Oct 29 '22 02:10