I have a dictionary in which each value is a list of tuples. For example, this is the value I extract for one key:
[('1998-01-20', 8), ('1998-01-22', 4), ('1998-06-18', 8), ('1999-07-15', 7), ('1999-07-21', 1)]
I have also sorted the list. Now I want to aggregate the values like this:
[('1998-01', 12), ('1998-06', 8), ('1999-07', 8)]
In other words, I want to group my tuples by month and sum the integers for each month. I have read about groupby, but I don't think it can help with my data structure, because I don't know in advance what will be in my list. So I'm trying to find a way to say: starting from the first elements of the tuples, if the i[0][:7] values are equal, sum the i[1] values. But I'm having difficulty implementing this idea.
for i in List:
    if i[0][:7]  # *problem*: I don't know how to express my condition
        s = sum(i[1])  # ?
I would appreciate any advice, as I'm new to Python!
Try using itertools.groupby to aggregate values by month:
from itertools import groupby

a = [('1998-01-20', 8), ('1998-01-22', 4), ('1998-06-18', 8),
     ('1999-07-15', 7), ('1999-07-21', 1)]

# Group consecutive tuples by their 'YYYY-MM' prefix and sum the values.
for key, group in groupby(a, key=lambda x: x[0][:7]):
    print(key, sum(j for i, j in group))
# Output
1998-01 12
1998-06 8
1999-07 8
Here's a one-liner version:
print([(key, sum(j for i, j in group)) for key, group in groupby(a, key=lambda x: x[0][:7])])
# Output
[('1998-01', 12), ('1998-06', 8), ('1999-07', 8)]
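One caveat: groupby only merges adjacent items that share a key, so the snippets above rely on the list already being sorted, as it is in the question. If the input might not be sorted, something like the following sketch could be used. It assumes every date is an ISO-style 'YYYY-MM-DD' string; the names month_of and sum_by_month are made up for illustration.

```python
from itertools import groupby


def month_of(pair):
    """Return the 'YYYY-MM' prefix of a (date_string, value) tuple."""
    return pair[0][:7]


def sum_by_month(pairs):
    """Sum values per month, sorting first so each month forms one run."""
    ordered = sorted(pairs, key=month_of)
    return [(month, sum(value for _, value in group))
            for month, group in groupby(ordered, key=month_of)]


# Deliberately shuffled copy of the data from the question.
unsorted = [('1999-07-21', 1), ('1998-01-22', 4), ('1998-06-18', 8),
            ('1998-01-20', 8), ('1999-07-15', 7)]
result = sum_by_month(unsorted)
print(result)
```

Sorting costs O(n log n), so if the list is known to be sorted already, the plain groupby call is enough.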
Just use defaultdict:
from collections import defaultdict
from pprint import pprint

DATA = [
    ('1998-01-20', 8),
    ('1998-01-22', 4),
    ('1998-06-18', 8),
    ('1999-07-15', 7),
    ('1999-07-21', 1),
]

# Accumulate each value under its 'YYYY-MM' key; missing keys start at 0.
groups = defaultdict(int)
for date, value in DATA:
    groups[date[:7]] += value

pprint(groups)
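The defaultdict approach produces a mapping rather than the list of tuples shown in the question, and it does not need the input to be sorted. A minimal sketch of getting the list form, assuming ISO-style 'YYYY-MM-DD' dates so that sorting the 'YYYY-MM' keys as strings also sorts them chronologically (the function name aggregate_by_month is hypothetical):

```python
from collections import defaultdict


def aggregate_by_month(pairs):
    """Return a sorted list of (month, total) tuples from (date, value) pairs."""
    totals = defaultdict(int)
    for date, value in pairs:
        totals[date[:7]] += value  # 'YYYY-MM-DD'[:7] == 'YYYY-MM'
    # items() yields (month, total) pairs; sorting orders them by month.
    return sorted(totals.items())


data = [
    ('1998-01-20', 8),
    ('1998-01-22', 4),
    ('1998-06-18', 8),
    ('1999-07-15', 7),
    ('1999-07-21', 1),
]
monthly = aggregate_by_month(data)
print(monthly)
```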