
Limitation of group size in itertools.groupby

I'm looking for a way to limit the size of the groups created by itertools.groupby.

Currently I have something like this:

>>> from itertools import groupby
>>> s = '555'
>>> grouped = groupby(s)
>>> print([(k, len(list(g))) for k, g in grouped])
[('5', 3)]

What I would like to achieve is a maximum group size of 2, so that my output would be:

[('5', 2), ('5', 1)]

Is there an easy and efficient way to do this? Perhaps through the key argument of groupby?

asked Jan 21 '26 09:01 by mchfrnc


1 Answer

Here is a solution using groupby and a defaultdict.

from itertools import groupby
from collections import defaultdict

s = "5555444"
desired_length = 2
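counts = defaultdict(int)  # occurrences seen so far, per value

def count(x):
    # Return how many times x was seen before this call, then record it.
    # No `global` statement is needed: counts is mutated, not rebound.
    c = counts[x]
    counts[x] += 1
    return c

# Key each item by (value, chunk index) so a new group starts every desired_length items.
grouped = groupby(s, key=lambda x: (x, count(x) // desired_length))
print([(k[0], len(list(g))) for k, g in grouped])
# [('5', 2), ('5', 2), ('4', 2), ('4', 1)]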

I honestly think this solution is unacceptable, as it relies on global state: the counts persist between calls and have to be reset before every reuse. But here it is. I would personally just use a buffer-like approach:

from collections import defaultdict
s = "5555444"

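def my_buffer_function(sequence, desired_length):
    buffer = defaultdict(int)  # running count per value
    for item in sequence:
        buffer[item] += 1
        # Emit a full group as soon as it reaches the desired length.
        if buffer[item] == desired_length:
            yield (item, buffer.pop(item))
    # Emit whatever is left over as smaller groups.
    for k, v in buffer.items():
        yield k, v

print(list(my_buffer_function(s, 2)))
# [('5', 2), ('5', 2), ('4', 2), ('4', 1)]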

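This is also a generator. Unlike groupby, however, it counts occurrences across the whole sequence rather than within consecutive runs, and leftover counts are only emitted at the end, so it might miss some things groupby has that you currently rely on.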
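
If groupby's consecutive-run behaviour matters, one possible alternative is to let groupby form the runs and then cut each run into pieces with itertools.islice. This is only a minimal sketch, assuming Python 3; the name limited_groupby is hypothetical and not part of itertools:

```python
from itertools import groupby, islice

def limited_groupby(iterable, max_size):
    """Yield (key, count) for consecutive runs, splitting runs longer than max_size.

    Hypothetical helper for illustration; not an itertools function.
    """
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    for key, group in groupby(iterable):
        while True:
            # Consume at most max_size items from the current run and count them.
            chunk_len = sum(1 for _ in islice(group, max_size))
            if chunk_len == 0:
                break  # the run is exhausted
            yield key, chunk_len

print(list(limited_groupby("555", 2)))
# [('5', 2), ('5', 1)]
```

No state is kept outside the generator, and a value that reappears later in the sequence starts a fresh group, as it would with plain groupby.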

answered Jan 23 '26 00:01 by amdex


