There's one in an old version of the Python docs with itertools
examples:
from itertools import islice

def window(seq, n=2):
    "Returns a sliding window (of width n) over data from the iterable"
    "   s -> (s0,s1,...s[n-1]), (s1,s2,...,sn), ...                   "
    it = iter(seq)
    result = tuple(islice(it, n))
    if len(result) == n:
        yield result
    for elem in it:
        result = result[1:] + (elem,)
        yield result
The one from the docs is a little more succinct and, I imagine, uses itertools to greater effect.
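For reference, here is how the recipe above behaves (output shown in comments):

for w in window([0, 1, 2, 3, 4, 5], 3):
    print(w)
# (0, 1, 2)
# (1, 2, 3)
# (2, 3, 4)
# (3, 4, 5)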
If your iterable is a simple list or tuple, a simple way to slide through it with a specified window size would be:
seq = [0, 1, 2, 3, 4, 5]
window_size = 3

for i in range(len(seq) - window_size + 1):
    print(seq[i: i + window_size])
Output:
[0, 1, 2]
[1, 2, 3]
[2, 3, 4]
[3, 4, 5]
This seems tailor-made for a collections.deque since you essentially have a FIFO (add to one end, remove from the other). However, even if you use a list you shouldn't be slicing twice; instead, you should probably just pop(0) from the list and append() the new item.
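As a rough sketch of that list-based variant (a hypothetical window_list helper, not part of the original answer):

from itertools import islice

def window_list(seq, n=2):
    # Sketch only: a plain list used as a FIFO. Note that pop(0) is O(n)
    # for a list, which is exactly why the deque version below is preferable.
    it = iter(seq)
    win = list(islice(it, n))
    if len(win) == n:
        yield tuple(win)
    for e in it:
        win.pop(0)      # drop the oldest item from the front
        win.append(e)   # push the newest item onto the end
        yield tuple(win)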
Here is an optimized deque-based implementation patterned after your original:
from collections import deque

def window(seq, n=2):
    it = iter(seq)
    # Fill the deque with the first n items; maxlen=n makes it a FIFO
    win = deque((next(it, None) for _ in range(n)), maxlen=n)
    yield win
    append = win.append  # local alias to avoid an attribute lookup per item
    for e in it:
        append(e)        # pushes e and automatically drops the oldest item
        yield win
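Note that this yields the same deque object on every iteration, so if you want to keep the windows around rather than consume them immediately, copy each one as you go:

ws = [tuple(w) for w in window([0, 1, 2, 3, 4, 5], 3)]
print(ws)
# [(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5)]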
In my tests it handily beats everything else posted here most of the time, though pillmuncher's tee version beats it for large iterables and small windows. On larger windows, the deque pulls ahead again in raw speed.
Access to individual items in the deque may be faster or slower than with lists or tuples. (Items near the beginning are faster, or items near the end if you use a negative index.) I put a sum(w) in the body of my loop; this plays to the deque's strength (iterating from one item to the next is fast, so this loop ran a full 20% faster than the next fastest method, pillmuncher's). When I changed it to individually look up and add items in a window of ten, the tables turned and the tee method was 20% faster. I was able to recover some speed by using negative indexes for the last five terms in the addition, but tee was still a little faster. Overall I would estimate that either one is plenty fast for most uses, and if you need a little more performance, profile and pick the one that works best.
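If you do want to profile them yourself, a minimal timeit sketch along these lines should do; window_deque and window_tee are hypothetical names for the deque-based function above and the tee-based function below (both are simply called window in their answers):

import timeit

# Assumes the two implementations are defined in this script under the
# hypothetical names window_deque and window_tee.
setup = "from __main__ import window_deque, window_tee; data = list(range(10000))"
for name in ("window_deque", "window_tee"):
    t = timeit.timeit(f"sum(sum(w) for w in {name}(data, 5))", setup=setup, number=100)
    print(name, t)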
I like tee():
from itertools import tee

def window(iterable, size):
    iters = tee(iterable, size)
    # Advance the i-th copy i times so consecutive copies are offset by one
    for i in range(1, size):
        for each in iters[i:]:
            next(each, None)
    return zip(*iters)

for each in window(range(6), 3):
    print(list(each))
gives:
[0, 1, 2]
[1, 2, 3]
[2, 3, 4]
[3, 4, 5]
There is a library which does exactly what you need:
import more_itertools
list(more_itertools.windowed([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15],n=3, step=3))
Out: [(1, 2, 3), (4, 5, 6), (7, 8, 9), (10, 11, 12), (13, 14, 15)]
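Note that step=3 in the call above produces non-overlapping chunks; leaving step at its default of 1 should give the usual overlapping sliding window:

import more_itertools

list(more_itertools.windowed([1, 2, 3, 4, 5], n=3))
# Out: [(1, 2, 3), (2, 3, 4), (3, 4, 5)]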
Here's a generalization that adds support for step and fillvalue parameters:
from collections import deque
from itertools import islice

def sliding_window(iterable, size=2, step=1, fillvalue=None):
    if size < 0 or step < 1:
        raise ValueError
    it = iter(iterable)
    q = deque(islice(it, size), maxlen=size)
    if not q:
        return  # empty iterable or size == 0
    q.extend(fillvalue for _ in range(size - len(q)))  # pad to size
    while True:
        yield iter(q)  # iter() to avoid accidental outside modifications
        try:
            q.append(next(it))
        except StopIteration:  # Python 3.5 pep 479 support
            return
        q.extend(next(it, fillvalue) for _ in range(step - 1))
It yields chunks of size items at a time, rolling step positions per iteration and padding each chunk with fillvalue if necessary. Example for size=4, step=3, fillvalue='*':
[a b c d]e f g h i j k l m n o p q r s t u v w x y z
a b c[d e f g]h i j k l m n o p q r s t u v w x y z
a b c d e f[g h i j]k l m n o p q r s t u v w x y z
a b c d e f g h i[j k l m]n o p q r s t u v w x y z
a b c d e f g h i j k l[m n o p]q r s t u v w x y z
a b c d e f g h i j k l m n o[p q r s]t u v w x y z
a b c d e f g h i j k l m n o p q r[s t u v]w x y z
a b c d e f g h i j k l m n o p q r s t u[v w x y]z
a b c d e f g h i j k l m n o p q r s t u v w x[y z * *]
For an example use case for the step parameter, see Processing a large .txt file in python efficiently.
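A quick usage sketch of sliding_window (each yielded window is an iterator over the internal deque, so consume it before advancing; the letters here just mirror the diagram above):

import string

for w in sliding_window(string.ascii_lowercase, size=4, step=3, fillvalue='*'):
    print(list(w))
# ['a', 'b', 'c', 'd']
# ['d', 'e', 'f', 'g']
# ...
# ['y', 'z', '*', '*']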