
Groupby multiple columns in a list

Tags:

python

I have a list of lists like the one below:

[['H1', 'L', '1'],
 ['H1', 'S', '1'],
 ['H2', 'L', '1'],
 ['H2', 'L', '1']]

I want to group the rows by the first and second columns. Does Python provide anything for lists that would give me the result below?

H1 L 1
H1 S 1
H2 L 2
Ajay Singh asked Feb 05 '18 09:02



3 Answers

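You can use itertools.groupby and then sum up the last column of each group. Note that groupby only merges consecutive rows with equal keys, so the list must already be ordered by the grouping columns, as it is in the question.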

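from itertools import groupby

# Input rows from the question; the last column holds strings, hence int() below.
l = [['H1', 'L', '1'], ['H1', 'S', '1'], ['H2', 'L', '1'], ['H2', 'L', '1']]

out = []
# Group consecutive rows that share the first two columns.
for k, v in groupby(l, key=lambda x: x[:2]):
    s = sum(int(x[-1]) for x in v)
    out.append(k + [s])

print(out)
# [['H1', 'L', 1], ['H1', 'S', 1], ['H2', 'L', 2]]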
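If the rows are not already ordered by the grouping columns, a dictionary keyed on the first two columns avoids the need to sort. The following is only a minimal sketch and not part of the original answer; the function name group_and_sum and its n_keys parameter are hypothetical names chosen for illustration, and the sample rows are made up.

```python
from collections import defaultdict


def group_and_sum(rows, n_keys=2):
    """Sum the last column of `rows`, grouped by the first `n_keys` columns.

    Hypothetical helper: assumes every row has more than `n_keys` items and
    that the last item can be converted with int().
    """
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[:n_keys])] += int(row[-1])
    # Dicts keep insertion order (Python 3.7+), so groups come out in
    # first-seen order.
    return [list(key) + [total] for key, total in totals.items()]


# Made-up sample in which the two ['H2', 'L'] rows are not adjacent.
rows = [['H2', 'L', '1'], ['H1', 'L', '1'], ['H2', 'L', '1'], ['H1', 'S', '1']]
print(group_and_sum(rows))
# [['H2', 'L', 2], ['H1', 'L', 1], ['H1', 'S', 1]]
```

The trade-off is that the result follows first-seen order rather than sorted order; sort the returned list if a particular ordering matters.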
Ghilas BELHADJ answered Oct 19 '22 17:10


You can use itertools.groupby along with operator.itemgetter to achieve the desired result:

>>> from operator import itemgetter
>>> from itertools import groupby

>>> items = [['H1','L', '1'], ['H1','S', '1'], ['H2','L', '1'], ['H2','L', '1']]
>>> [(*k, sum(int(i[2]) for i in g)) for k, g in groupby(items, key=itemgetter(0, 1))]
[('H1', 'L', 1), ('H1', 'S', 1), ('H2', 'L', 2)]
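itertools.groupby starts a new group every time the key changes, so rows with the same key that are not adjacent end up in separate groups. The sketch below illustrates this and shows that sorting by the same key first avoids it. The list unsorted_items and the helper sum_groups are made up for illustration and do not come from the question or the answer.

```python
from itertools import groupby
from operator import itemgetter

key = itemgetter(0, 1)  # group on the first two columns

# Made-up sample: the two ['H2', 'L'] rows are not adjacent.
unsorted_items = [['H2', 'L', '1'], ['H1', 'L', '1'], ['H2', 'L', '1'], ['H1', 'S', '1']]


def sum_groups(rows):
    """Sum the third column for each run of consecutive rows sharing a key."""
    return [(*k, sum(int(r[2]) for r in g)) for k, g in groupby(rows, key=key)]


# Without sorting, ('H2', 'L') shows up as two separate groups.
print(sum_groups(unsorted_items))
# [('H2', 'L', 1), ('H1', 'L', 1), ('H2', 'L', 1), ('H1', 'S', 1)]

# Sorting by the same key first makes equal keys adjacent.
print(sum_groups(sorted(unsorted_items, key=key)))
# [('H1', 'L', 1), ('H1', 'S', 1), ('H2', 'L', 2)]
```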
Sohaib Farooqi answered Oct 19 '22 19:10


Another option is to use pandas (note that the last column is stored as integers here, not as strings as in the question):

import pandas as pd
df = pd.DataFrame([['H1', 'L', 1], ['H1', 'S', 1], ['H2', 'L', 1], ['H2', 'L', 1]],
                  columns=['H', 'LS', '1'])
df.groupby(['H','LS']).sum()

returning

       1
H  LS
H1 L   1
   S   1
H2 L   2

or, to get the group keys back as ordinary columns:

>>> df.groupby(['H','LS']).sum().reset_index()
    H LS  1
0  H1  L  1
1  H1  S  1
2  H2  L  2
Dan answered Oct 19 '22 19:10