
Pythonic way to group items in a list [duplicate]

Consider a list of dicts:

items = [
    {'a': 1, 'b': 9, 'c': 8},
    {'a': 1, 'b': 5, 'c': 4},
    {'a': 2, 'b': 3, 'c': 1},
    {'a': 2, 'b': 7, 'c': 9},
    {'a': 3, 'b': 8, 'c': 2}
]

Is there a pythonic way to extract and group these items by their a field, such that:

result = {
    1: [{'b': 9, 'c': 8}, {'b': 5, 'c': 4}],
    2: [{'b': 3, 'c': 1}, {'b': 7, 'c': 9}],
    3: [{'b': 8, 'c': 2}]
}

References to any similar Pythonic constructs are appreciated.

Yuval Adam asked Apr 22 '26 12:04



2 Answers

Use itertools.groupby:

>>> from itertools import groupby
>>> from operator import itemgetter
>>> {k: list(g) for k, g in groupby(items, itemgetter('a'))}
{1: [{'a': 1, 'c': 8, 'b': 9},
     {'a': 1, 'c': 4, 'b': 5}],
 2: [{'a': 2, 'c': 1, 'b': 3},
     {'a': 2, 'c': 9, 'b': 7}],
 3: [{'a': 3, 'c': 2, 'b': 8}]}

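If the items are not already sorted by the grouping key, groupby will not work as intended, because it only groups consecutive items with equal keys. You can either sort them first and then use groupby, which costs O(N log N), or build the groups in a single O(N) pass with collections.OrderedDict (if order matters) or collections.defaultdict: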

>>> from collections import OrderedDict
>>> d = OrderedDict()
>>> for item in items:
...     d.setdefault(item['a'], []).append(item)
...     
>>> dict(d.items())
{1: [{'a': 1, 'c': 8, 'b': 9},
     {'a': 1, 'c': 4, 'b': 5}],
 2: [{'a': 2, 'c': 1, 'b': 3},
     {'a': 2, 'c': 9, 'b': 7}],
 3: [{'a': 3, 'c': 2, 'b': 8}]}
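
The collections.defaultdict variant mentioned above is not shown, so here is a minimal sketch of it, assuming Python 3.7 or later, where plain dicts keep insertion order. The sample list is a reshuffled copy of the question's data, chosen only to show that no sorting is needed.

```python
from collections import defaultdict

# The question's data, deliberately reshuffled so it is NOT sorted by 'a'.
items = [
    {'a': 2, 'b': 3, 'c': 1},
    {'a': 1, 'b': 9, 'c': 8},
    {'a': 2, 'b': 7, 'c': 9},
    {'a': 1, 'b': 5, 'c': 4},
    {'a': 3, 'b': 8, 'c': 2},
]

groups = defaultdict(list)
for item in items:
    # A missing key is created as an empty list on first access.
    groups[item['a']].append(item)

# Convert back to a plain dict so later lookups of unknown keys raise KeyError
# instead of silently creating empty groups.
grouped = dict(groups)
print(grouped)
```

The groups come out in the order each key was first seen (2, 1, 3 here), and every item is visited exactly once.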

Update:

I see that you only want the keys that were not used for grouping to be returned. For that you'll need to do something like this:

>>> group_keys = {'a'}
>>> # Python 3: d.keys() supports set difference (Python 2 used d.viewkeys()).
>>> {k: [{f: d[f] for f in d.keys() - group_keys} for d in g]
...  for k, g in groupby(items, itemgetter(*group_keys))}
{1: [{'c': 8, 'b': 9},
     {'c': 4, 'b': 5}],
 2: [{'c': 1, 'b': 3},
     {'c': 9, 'b': 7}],
 3: [{'c': 2, 'b': 8}]}
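
For comparison, the same "drop the grouping key" result can be built in one pass without groupby, so the input does not need to be sorted. This is only a sketch: the helper name group_without_key and its parameters are hypothetical and not part of the original answer.

```python
from collections import defaultdict

def group_without_key(rows, field):
    """Group dicts by row[field], omitting that field from each grouped row.

    Hypothetical helper for illustration; works on unsorted input.
    """
    grouped = defaultdict(list)
    for row in rows:
        # Copy every entry except the one used as the grouping key.
        rest = {k: v for k, v in row.items() if k != field}
        grouped[row[field]].append(rest)
    return dict(grouped)

items = [
    {'a': 1, 'b': 9, 'c': 8},
    {'a': 1, 'b': 5, 'c': 4},
    {'a': 2, 'b': 3, 'c': 1},
    {'a': 2, 'b': 7, 'c': 9},
    {'a': 3, 'b': 8, 'c': 2},
]

result = group_without_key(items, 'a')
print(result)
```

The original dicts are left untouched, because each grouped row is a new dict.
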
Ashwini Chaudhary answered Apr 24 '26 01:04



Note: This code assumes that the data is already sorted by the grouping key. If it is not, you have to sort it first.

from itertools import groupby
print({key: list(grp) for key, grp in groupby(items, key=lambda x: x["a"])})

Output

{1: [{'a': 1, 'b': 9, 'c': 8}, {'a': 1, 'b': 5, 'c': 4}],
 2: [{'a': 2, 'b': 3, 'c': 1}, {'a': 2, 'b': 7, 'c': 9}],
 3: [{'a': 3, 'b': 8, 'c': 2}]}
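
To see why the sorting assumption matters, here is a small sketch of what happens when equal keys are not adjacent. The list unsorted_items is made-up sample data in the same shape as the question's.

```python
from itertools import groupby
from operator import itemgetter

# Made-up sample: the two items with a == 1 are not next to each other.
unsorted_items = [
    {'a': 1, 'b': 9, 'c': 8},
    {'a': 2, 'b': 3, 'c': 1},
    {'a': 1, 'b': 5, 'c': 4},
]

# groupby starts a new group whenever the key changes, so key 1 shows up twice.
runs = [(key, list(grp)) for key, grp in groupby(unsorted_items, key=itemgetter('a'))]

# Building a dict from those runs silently keeps only the last run per key.
lossy = dict(runs)

# Sorting first makes equal keys adjacent; sorted() is stable, so the
# original relative order inside each group is preserved.
by_a = sorted(unsorted_items, key=itemgetter('a'))
complete = {key: list(grp) for key, grp in groupby(by_a, key=itemgetter('a'))}

print([key for key, _ in runs])
print(lossy[1])
print(complete[1])
```

Without the sort, the first item with a == 1 is lost from the dict, and no error is raised.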

To get the result in the same format you asked for,

from itertools import groupby
from operator import itemgetter
a_getter, getter, keys = itemgetter("a"), itemgetter("b", "c"), ("b", "c")

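def recon_dicts(item):
    # Rebuild a dict that holds only the "b" and "c" fields of one item.
    return dict(zip(keys, getter(item)))

# list(map(...)) so that each group is a real list on Python 3, where map is lazy.
{key: list(map(recon_dicts, grp)) for key, grp in groupby(items, key=a_getter)}

Output

{1: [{'b': 9, 'c': 8}, {'b': 5, 'c': 4}],
 2: [{'b': 3, 'c': 1}, {'b': 7, 'c': 9}],
 3: [{'b': 8, 'c': 2}]}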

If the data is not already sorted, you can either use the defaultdict approach mentioned in the other answer, or sort the items by a with the sorted function, like this:

{key: list(map(recon_dicts, grp))
   for key, grp in groupby(sorted(items, key=a_getter), key=a_getter)}

References:

  1. operator.itemgetter

  2. itertools.groupby

  3. zip, map, dict, sorted

thefourtheye answered Apr 24 '26 00:04
