I am trying to implement lazy partitioning of an iterator object that yields slices of the iterator when a function on an element of the iterator changes values. This would mimic the behavior of Clojure's partition-by (although the semantics of the output would be different, since Python would genuinely "consume" the elements). My implementation is optimal in the number of operations it performs but not in the memory it requires. I don't see why a good implementation would need more than O(1) memory, but my implementation takes up O(k) memory, where k is the size of the partition. I would like to be able to handle cases where k is large. Does anyone know of a good implementation?
The correct behavior should be something like
>>> unagi = [-1, 3, 4, 7, -2, 1, -3, -5]
>>> parts = partitionby(lambda x: x < 0, unagi)
>>> print([[y for y in x] for x in parts])
[[-1], [3, 4, 7], [-2], [1], [-3, -5]]
Here is my current version
from itertools import chain

def partitionby(f, iterable):
    seq = iter(iterable)
    current = next(seq)
    justseen = next(seq)
    partition = iter([current])
    while True:
        if f(current) == f(justseen):
            # Extend the current partition by wrapping it in yet another
            # chain; after k elements this is a chain nested k deep, which
            # is where the O(k) memory comes from.
            partition = chain(partition, iter([justseen]))
            try:
                justseen = next(seq)
            except StopIteration:
                yield partition
                break
        else:
            # f changed value: hand out the finished partition and start a
            # fresh (empty) one; justseen gets added on the next pass.
            yield partition
            current = justseen
            partition = iter([])
Why not reuse groupby? I think it is O(1) in memory.
from itertools import groupby

def partitionby(f, iterable):
    return (g[1] for g in groupby(iterable, f))
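As a quick check against the example from the question, this yields the expected partitions:

>>> unagi = [-1, 3, 4, 7, -2, 1, -3, -5]
>>> parts = partitionby(lambda x: x < 0, unagi)
>>> [[y for y in x] for x in parts]
[[-1], [3, 4, 7], [-2], [1], [-3, -5]]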
The difference between groupby's implementation and yours is that each partition is a specialized iterator object pulling elements straight from the shared source, instead of a chain of a chain of a chain of ... with one wrapper object per element.
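For intuition, here is a minimal Python 3 sketch of that idea. It is not CPython's actual groupby code: partitionby_sketch is a hypothetical name, and the sketch assumes each partition is fully consumed before the next one is requested (groupby itself additionally fast-forwards past unconsumed groups).

def partitionby_sketch(f, iterable):
    # O(1) memory: only the current element and a done flag are held
    # between partitions, never a growing stack of wrappers.
    seq = iter(iterable)
    try:
        current = next(seq)
    except StopIteration:
        return  # empty input: yield no partitions
    done = False

    def partition(key):
        # Yield elements straight from seq while f keeps the same value.
        nonlocal current, done
        while f(current) == key:
            yield current
            try:
                current = next(seq)
            except StopIteration:
                done = True
                return

    while not done:
        yield partition(f(current))

The only state that survives between partitions is current and done, so memory stays O(1) no matter how large a single partition is.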