Remove the duplicate values and sum the corresponding column values

I have a list from which I need to remove the duplicate values and sum the corresponding column values. The list is:

lst = [['20150815171000', '1', '2'],
       ['20150815171000', '2', '3'],
       ['20150815172000', '3', '4'],
       ['20150815172000', '4', '5'],
       ['20150815172000', '5', '6'],
       ['20150815173000', '6', '7']]

Now I need to traverse the list and produce output like this, with one row per timestamp and the second and third columns summed:

lst2 = [['20150815171000', '3', '5'], 
        ['20150815172000', '12', '15'], 
        ['20150815173000', '6', '7']]

How could this be done? I have tried the code below, but it only compares consecutive rows, not all the rows that share the same first value.

    lst2 = []
    ws = wr = power = 0
    for i in range(len(lst)):
        if lst[i][0] == lst[i+1][0]:
            time = lst[i][0]
            ws = (float(lst[i][1])+float(lst[i+1][1]))
            wr = (float(lst[i][2])+float(lst[i+1][2]))      
        else:
           time = lst[i][0]
           ws = lst[i][1]
           wr = lst[i][2]
        lst2.append([time, ws, wr, power])

Can anyone tell me how to do this?

Vinod M S asked Sep 09 '15 09:09



1 Answer

I would use itertools.groupby, grouping on the first element of each inner list.

groupby only groups consecutive items with equal keys, so I would first sort the list on the first element and then group on it. (If the list is already sorted on that element, you do not need to sort again; you can group directly.)
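To illustrate why the sort matters, here is a minimal sketch using made-up data (the `rows` list is hypothetical, not from the question). Without sorting, the key 'a' shows up as two separate groups because its rows are not adjacent:

```python
import itertools

# Hypothetical unsorted data: the key 'a' appears in two separate runs.
rows = [['a', '1'], ['b', '2'], ['a', '3']]

# Without sorting, each consecutive run of equal keys becomes its own group.
unsorted_keys = [k for k, _ in itertools.groupby(rows, key=lambda x: x[0])]

# After sorting on the same key, equal keys are adjacent and form one group.
sorted_rows = sorted(rows, key=lambda x: x[0])
sorted_keys = [k for k, _ in itertools.groupby(sorted_rows, key=lambda x: x[0])]

print(unsorted_keys)  # ['a', 'b', 'a']
print(sorted_keys)    # ['a', 'b']
```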

Example -

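import itertools

new_lst = []
# Sort on the first element so equal keys are adjacent, then group on it.
for k, g in itertools.groupby(sorted(lst, key=lambda x: x[0]), key=lambda x: x[0]):
    l = list(g)  # materialise the group, since it is iterated twice below
    new_lst.append([k, str(sum(int(x[1]) for x in l)), str(sum(int(x[2]) for x in l))])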

Demo -

>>> import itertools
>>>
>>> lst = [['20150815171000', '1', '2'],
...        ['20150815171000', '2', '3'],
...        ['20150815172000', '3', '4'],
...        ['20150815172000', '4', '5'],
...        ['20150815172000', '5', '6'],
...        ['20150815173000', '6', '7']]
>>>
>>> new_lst = []
>>> for k,g in itertools.groupby(sorted(lst,key=lambda x:x[0]) , lambda x:x[0]):
...     l = list(g)
...     new_lst.append([k,str(sum([int(x[1]) for x in l])), str(sum([int(x[2]) for x in l]))])
...
>>> new_lst
[['20150815171000', '3', '5'], ['20150815172000', '12', '15'], ['20150815173000', '6', '7']]
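If sorting is undesirable, a possible alternative (a sketch, not part of the groupby approach above) is to accumulate the sums in a plain dict keyed on the first element. It assumes every row has exactly three fields and that the values are integer strings, as in the question; the output order follows first appearance of each key, which relies on dicts preserving insertion order (Python 3.7+).

```python
lst = [['20150815171000', '1', '2'],
       ['20150815171000', '2', '3'],
       ['20150815172000', '3', '4'],
       ['20150815172000', '4', '5'],
       ['20150815172000', '5', '6'],
       ['20150815173000', '6', '7']]

# Map each timestamp to a running [sum of column 1, sum of column 2].
totals = {}
for key, a, b in lst:
    sums = totals.setdefault(key, [0, 0])
    sums[0] += int(a)
    sums[1] += int(b)

# Convert back to the list-of-lists-of-strings shape used in the question.
new_lst = [[key, str(a), str(b)] for key, (a, b) in totals.items()]
print(new_lst)
```

This makes a single pass over the list and does not require the rows to be sorted or adjacent.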
Anand S Kumar answered Oct 01 '22 09:10
