
How to use python groupby() [duplicate]

When I try to use itertools.groupby to group a list of numbers like this:

from itertools import groupby

a = [1, 2, 1, 3, 2, 1, 2, 3, 4, 5]

for key, value in groupby(a):
    print((len(list(value)), key), end=' ')

The output is

(1, 1) (1, 2) (1, 1) (1, 3) (1, 2) (1, 1) (1, 2) (1, 3) (1, 4) (1, 5) 

instead of

(3, 1) (3, 2) (2, 3) (1, 4) (1, 5)

Why doesn't it group identical numbers correctly?

mohitmonu asked Jun 27 '17 07:06



3 Answers

Grouping input by common key elements with groupby() only works on input already sorted by that key:

[...] Generally, the iterable needs to already be sorted on the same key function.

Your example should work like this:

from itertools import groupby

a = sorted([1, 2, 1, 3, 2, 1, 2, 3, 4, 5])

for key, value in groupby(a):
    print((len(list(value)), key), end=' ')

If you use groupby() on unordered input, you'll get a new group every time the key function returns a different key from the previous element while iterating through the iterable. Equal keys that are not adjacent therefore end up in separate groups.
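To make the run-based behaviour concrete, here is a minimal sketch comparing unsorted and sorted input. The helper name run_lengths and the short sample list are assumptions made for illustration; they are not part of the question.

```python
from itertools import groupby


def run_lengths(values):
    """Return (count, key) pairs, one per run of consecutive equal values."""
    return [(len(list(group)), key) for key, group in groupby(values)]


# Illustrative sample data (not the list from the question).
a = [1, 1, 2, 1, 3, 3]

# Unsorted: the second run of 1s starts a new group.
unsorted_runs = run_lengths(a)
print(unsorted_runs)  # [(2, 1), (1, 2), (1, 1), (2, 3)]

# Sorted: equal values are adjacent, so each key appears exactly once.
sorted_runs = run_lengths(sorted(a))
print(sorted_runs)  # [(3, 1), (1, 2), (2, 3)]
```

Note that each group is consumed with list(group) before moving on to the next key, because the group iterators share the underlying iterable with groupby() itself.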

mata answered Oct 16 '22 12:10


Based on the output you want, this is really a counting problem rather than a grouping one, so I'd reframe the question. collections.Counter is simple to use here:

from collections import Counter

a = [1, 2, 1, 3, 2, 1, 2, 3, 4, 5]

[ (v, k) for k, v in Counter(a).items() ]
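Counter yields its items in first-seen order (guaranteed for dicts since Python 3.7), which happens to match the desired output for this list. If the result should be ordered by key regardless of input order, a small variation could sort the keys explicitly. This is a sketch; the names counts and pairs are chosen here for illustration.

```python
from collections import Counter

a = [1, 2, 1, 3, 2, 1, 2, 3, 4, 5]

# Count every value in one pass; no sorting of the input is needed.
counts = Counter(a)

# Build (count, key) pairs ordered by key rather than by first appearance.
pairs = [(counts[key], key) for key in sorted(counts)]

print(*pairs)  # (3, 1) (3, 2) (2, 3) (1, 4) (1, 5)
```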
Transhuman answered Oct 16 '22 10:10


itertools.groupby only groups consecutive elements, so you need to sort the input before calling groupby.

from itertools import groupby

a = sorted([1, 2, 1, 3, 2, 1, 2, 3, 4, 5])

for key, value in groupby(a):
    print((len(list(value)), key), end=' ')

Result

(3, 1) (3, 2) (2, 3) (1, 4) (1, 5)
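The same rule applies when grouping by something other than the value itself: the input should be sorted with the same key function that is passed to groupby(). A minimal sketch, assuming a hypothetical list of words grouped by length:

```python
from itertools import groupby

# Hypothetical sample data, used only for illustration.
words = ["pear", "fig", "kiwi", "plum", "yam"]

# Sort and group with the same key function (len) so that
# words of equal length are adjacent before grouping.
by_length = {
    length: list(group)
    for length, group in groupby(sorted(words, key=len), key=len)
}

print(by_length)  # {3: ['fig', 'yam'], 4: ['pear', 'kiwi', 'plum']}
```

Because sorted() is stable, words of the same length keep their original relative order within each group.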
Rahul K P answered Oct 16 '22 10:10