I have a list of lists (string,integer)
eg:
my_list=[["apple",5],["banana",6],["orange",6],["banana",9],["orange",3],["apple",111]]
I'd like to sum the same items and finally get this:
my2_list=[["apple",116],["banana",15],["orange",9]]
You can use itertools.groupby
on the sorted list:
from itertools import groupby
my_list=[["apple",5],["banana",6],["orange",6],["banana",9],["orange",3],["apple",111]]
my_list2 = []
for i, g in groupby(sorted(my_list), key=lambda x: x[0]):
my_list2.append([i, sum(v[1] for v in g)])
print(my_list2)
# [['apple', 116], ['banana', 15], ['orange', 9]]
Speaking of SQL Group By and pre-sorting:
The operation of
groupby()
is similar to the uniq filter in Unix. It generates a break or new group every time the value of the key function changes (which is why it is usually necessary to have sorted the data using the same key function). That behavior differs from SQL’s GROUP BY which aggregates common elements regardless of their input order.
Emphasis Mine
from collections import defaultdict
my_list= [["apple",5],["banana",6],["orange",6],["banana",9],["orange",3],["apple",111]]
result = defaultdict(int)
for fruit, value in my_list:
result[fruit] += value
result = result.items()
print result
Or you can keep result as dictionary
Using Pandas and groupby
:
import pandas as pd
>>> pd.DataFrame(my_list, columns=['fruit', 'count']).groupby('fruit').sum()
count
fruit
apple 116
banana 15
orange 9
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With