itertools.groupby() not grouping correctly

Question

I have this data:

self.data = [(1, 1, 5.0),
             (1, 2, 3.0),
             (1, 3, 4.0),
             (2, 1, 4.0),
             (2, 2, 2.0)]

When I run this code:

for mid, group in itertools.groupby(self.data, key=operator.itemgetter(0)):

then list(group) for the first group gives me:

[(1, 1, 5.0),
 (1, 2, 3.0),
 (1, 3, 4.0)]

which is what I want.

But if I use 1 instead of 0

for mid, group in itertools.groupby(self.data, key=operator.itemgetter(1)):

to group by the second number in the tuples, I only get:

[(1, 1, 5.0)]

even though there are other tuples that have 1 in index 1 (the second position).

unutbu · Accepted Answer

itertools.groupby collects together contiguous items with the same key. If you want all items with the same key, you have to sort self.data first.

by_second = operator.itemgetter(1)
for mid, group in itertools.groupby(sorted(self.data, key=by_second), key=by_second):
    print(mid, list(group))
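
To see the contiguity rule in action, here is a minimal sketch using the question's data as a plain module-level list. The names data, by_second, and grouped are made up for illustration and are not part of the original code.

```python
import itertools
import operator

# The question's data, as a plain list instead of self.data (assumption).
data = [(1, 1, 5.0), (1, 2, 3.0), (1, 3, 4.0), (2, 1, 4.0), (2, 2, 2.0)]
by_second = operator.itemgetter(1)


def grouped(items, key):
    """Materialize groupby output as a list of (key, items) pairs."""
    return [(k, list(g)) for k, g in itertools.groupby(items, key=key)]


# Without sorting, a new group starts every time the key changes,
# so the keys 1 and 2 each show up twice.
unsorted_groups = grouped(data, by_second)
print([k for k, _ in unsorted_groups])  # [1, 2, 3, 1, 2]

# After sorting by the same key, equal keys are adjacent and merge.
sorted_groups = grouped(sorted(data, key=by_second), by_second)
for k, items in sorted_groups:
    print(k, items)
```

Because sorted() is stable, the items inside each group keep their original relative order.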

Konstantine Rybnikov · Answer

A variant without sorting, using a dictionary. It should perform better, since it makes a single pass over the data instead of sorting it first. The keys must be hashable.

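from collections import defaultdict

def full_group_by(l, key=lambda x: x):
    # Collect every item into the list for its key; input order does not matter.
    d = defaultdict(list)
    for item in l:
        d[key(item)].append(item)
    return d.items()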
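
As a rough usage sketch, this is how the dictionary approach could be applied to the question's data. The variable names data and result are assumptions for illustration, and the function is repeated (with renamed locals) so the snippet runs on its own.

```python
from collections import defaultdict
from operator import itemgetter


def full_group_by(items, key=lambda x: x):
    """Group items by key(item) in one pass, without sorting."""
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item)
    return groups.items()


# The question's data, as a plain list instead of self.data (assumption).
data = [(1, 1, 5.0), (1, 2, 3.0), (1, 3, 4.0), (2, 1, 4.0), (2, 2, 2.0)]

# Group by the second element of each tuple.
result = dict(full_group_by(data, key=itemgetter(1)))
for k, items in result.items():
    print(k, items)
```

On Python 3.7+ the groups come out in the order each key is first seen, not in sorted key order, so sort the result if you need ordered keys.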

Shital Shah · Answer

The function below "fixes" several annoyances with Python's itertools.groupby.

def groupby2(l, key=lambda x: x, val=lambda x: x, agg=lambda x: x, sort=True):
    if sort:
        l = sorted(l, key=key)
    return ((k, agg(val(x) for x in v))
            for k, v in itertools.groupby(l, key=key))

Specifically,

It doesn't require you to sort your data beforehand (it sorts by key unless you pass sort=False).
The key function can be passed positionally.
The output is a clean generator of (key, grouped_values) tuples, where the values are selected by the third parameter, val.
It makes it easy to apply aggregation functions such as sum or statistics.mean.

Example Usage

import itertools
from operator import itemgetter
from statistics import mean  # only needed if you aggregate with mean

t = [('a', 1), ('b', 2), ('a', 3)]
for k, v in groupby2(t, itemgetter(0), itemgetter(1), sum):
    print(k, v)

This prints:

a 4
b 2
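
The same helper should work with other aggregators as well. Below is a small sketch that swaps sum for statistics.mean, with groupby2 repeated so the snippet is self-contained. The variable name averages is an assumption for illustration.

```python
import itertools
from operator import itemgetter
from statistics import mean


def groupby2(l, key=lambda x: x, val=lambda x: x, agg=lambda x: x, sort=True):
    # Sort first (unless told not to) so equal keys end up adjacent.
    if sort:
        l = sorted(l, key=key)
    return ((k, agg(val(x) for x in v))
            for k, v in itertools.groupby(l, key=key))


t = [('a', 1), ('b', 2), ('a', 3)]

# Average the second element of each tuple, grouped by the first.
averages = list(groupby2(t, itemgetter(0), itemgetter(1), mean))
for k, v in averages:
    print(k, v)
```

Note that the result is a lazy generator, so it has to be consumed (here with list()) before the values are actually computed.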
