
Aggregating an array of objects by attribute

I have a list of dicts, each with two key/value pairs. I need to combine dicts that share the same value for the first key, by summing the values of their second keys. For example:

[
    {'foo': 34, 'bar': 2}, 
    {'foo': 34, 'bar': 3}, 
    {'foo': 35, 'bar': 1}, 
    {'foo': 35, 'bar': 7}, 
    {'foo': 35, 'bar': 2}
]

would come out as:

[
    {'foo': 34, 'bar': 5}, 
    {'foo': 35, 'bar': 10}
]

I wrote the following function, which works, but it seems horribly verbose, and I am almost sure there is a cleaner, more performant Pythonic way to do this.

def combine(arr):
    arr_out = []
    if arr:
        arr_out.append({'foo': arr[0]['foo'], 'bar': 0})
        for i in range(len(arr)):
            if arr[i]['foo'] == arr_out[-1]['foo']:
                arr_out[-1]['bar'] += arr[i]['bar']
            else:
                arr_out.append({'foo': arr[i]['foo'], 'bar': arr[i]['bar']})
    return arr_out

Anyone have any suggestions?

domoarigato asked May 25 '26 03:05


1 Answer

Using itertools.groupby, which groups only consecutive items that share a key (hence the sorted call):

>>> arr = [
...     {'foo': 34, 'bar': 2},
...     {'foo': 34, 'bar': 3},
...     {'foo': 35, 'bar': 1},
...     {'foo': 35, 'bar': 7},
...     {'foo': 35, 'bar': 2}
... ]
>>> import itertools
>>> key = lambda d: d['foo']
>>> [{'foo': k, 'bar': sum(d['bar'] for d in grp)}
...  for k, grp in itertools.groupby(sorted(arr, key=key), key=key)]
[{'foo': 34, 'bar': 5}, {'foo': 35, 'bar': 10}]

If the list is already sorted by 'foo', you can omit the sorted call:

>>> [{'foo': k, 'bar': sum(d['bar'] for d in grp)}
...  for k, grp in itertools.groupby(arr, key=key)]
[{'foo': 34, 'bar': 5}, {'foo': 35, 'bar': 10}]
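
To see why the sort matters, here is a minimal sketch that wraps the groupby approach in a function. The name combine is borrowed from the question, and the sort flag and the unsorted sample data are assumptions added for illustration. With sort=False on input that is not ordered by 'foo', equal keys that are not adjacent end up in separate groups.

```python
from itertools import groupby
from operator import itemgetter


def combine(arr, sort=True):
    """Sum 'bar' over dicts that share a 'foo' value (illustrative helper)."""
    get_foo = itemgetter('foo')
    # groupby only merges *adjacent* items with equal keys, so sort first
    # unless the caller guarantees the input is already ordered by 'foo'.
    rows = sorted(arr, key=get_foo) if sort else arr
    return [
        {'foo': foo, 'bar': sum(d['bar'] for d in grp)}
        for foo, grp in groupby(rows, key=get_foo)
    ]


# Hypothetical input that is NOT ordered by 'foo'.
unsorted = [
    {'foo': 35, 'bar': 1},
    {'foo': 34, 'bar': 2},
    {'foo': 35, 'bar': 7},
]

with_sort = combine(unsorted)
without_sort = combine(unsorted, sort=False)

print(with_sort)     # [{'foo': 34, 'bar': 2}, {'foo': 35, 'bar': 8}]
print(without_sort)  # the two foo=35 rows stay separate
```

Sorting costs O(n log n), so skipping it is only worthwhile when the input is known to be ordered by 'foo' already.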
falsetru answered May 27 '26 10:05