
Python list sort by size of group

I have a list of labeled items, e.g. item_labels = [('a', 3), ('b', 2), ('c', 1), ('d', 3), ('e', 2), ('f', 3)]

I want to sort them by the size of each label's group. In the example above, label 3 has group size 3 and label 2 has group size 2.

I tried combining groupby and sorted, but it didn't work (op and itt below are aliases for operator and itertools):

In [162]: sil = sorted(item_labels, key=op.itemgetter(1))

In [163]: sil
Out[163]: [('c', 1), ('b', 2), ('e', 2), ('a', 3), ('d', 3), ('f', 3)]

In [164]: g = itt.groupby(sil,)
Display all 465 possibilities? (y or n)

In [164]: g = itt.groupby(sil, key=op.itemgetter(1))

In [165]: for k, v in g:
   .....:     print k, list(v)
   .....:
   .....:
1 [('c', 1)]
2 [('b', 2), ('e', 2)]
3 [('a', 3), ('d', 3), ('f', 3)]

In [166]: sg = sorted(g, key=lambda x: len(list(x[1])))

In [167]: sg
Out[167]: [] # not sure why I got an empty list here

I can always write a tedious for-loop to do this, but I would rather find something more elegant. Any suggestions? If there are libraries that would help (e.g., pandas, scipy), I would be happy to use them.

clwen asked Dec 21 '22 04:12


1 Answer

In Python 2.7 and above, use Counter:

from collections import Counter
c = Counter(y for _, y in item_labels)
item_labels.sort(key=lambda t : c[t[1]])
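For reference, here is a self-contained sketch of the same idea. It uses sorted() rather than an in-place sort, and it adds the label itself as a tie-breaker in the key; both are my own choices rather than part of the snippet above. The tie-breaker only matters when two different labels have groups of the same size, where sorting by count alone would leave their items interleaved in original order.

```python
from collections import Counter

item_labels = [('a', 3), ('b', 2), ('c', 1), ('d', 3), ('e', 2), ('f', 3)]

# Count how many items carry each label: {3: 3, 2: 2, 1: 1}
counts = Counter(label for _, label in item_labels)

# Sort by group size; the label is a tie-breaker (an addition for
# illustration) so that equally sized groups do not interleave.
by_group_size = sorted(item_labels, key=lambda t: (counts[t[1]], t[1]))

print(by_group_size)
# [('c', 1), ('b', 2), ('e', 2), ('a', 3), ('d', 3), ('f', 3)]
```

Passing reverse=True to sorted() would put the largest groups first instead.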

In Python 2.6, where Counter is not available, a stand-in that is good enough for our purpose can be written with defaultdict (as suggested by @perreal):

from collections import defaultdict
def Counter(x):
    d = defaultdict(int)
    for v in x: d[v]+=1
    return d

Since the labels here are numbers, and assuming they are small non-negative integers like those in your example, we can actually use a list indexed by label (which will be compatible with even older versions of Python):

def Counter(x):
    lst = list(x)
    d = [0] * (max(lst)+1)
    for v in lst: d[v]+=1
    return d

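Without Counter, you can simply do this (note the use of sorted() rather than item_labels.sort(): CPython makes a list appear empty while it is being sorted in place, so a key function that reads item_labels during .sort() would count nothing):

item_labels = sorted(item_labels, key=lambda t: len([x for x in item_labels if x[1] == t[1]]))

It is slower (each key computation rescans the whole list, so the cost is quadratic in the list length), but reasonable for short lists.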


The reason you got an empty list is that g is an iterator, and an iterator can only be consumed once. The for loop in In [165] already exhausted it, so sorted() had nothing left to work with.
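If you want to keep the groupby approach, a rough sketch of a working version follows. Note that recreating g is not enough on its own: as far as I can tell, sorted(g, key=...) consumes all of g before it calls the key function, and each group iterator stops being usable once groupby moves on to the next group, so every len(list(x[1])) would come out as 0. Turning each group into a list while iterating avoids that. The names label, groups and result are only for illustration.

```python
from itertools import groupby
from operator import itemgetter

item_labels = [('a', 3), ('b', 2), ('c', 1), ('d', 3), ('e', 2), ('f', 3)]
label = itemgetter(1)

# groupby only groups adjacent items, so sort by label first.
# list(v) materializes each group immediately, before groupby advances.
groups = [list(v) for _, v in groupby(sorted(item_labels, key=label), key=label)]

# Order the groups by their size, then flatten back into a single list.
groups.sort(key=len)
result = [item for group in groups for item in group]

print(result)
# [('c', 1), ('b', 2), ('e', 2), ('a', 3), ('d', 3), ('f', 3)]
```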

Elazar answered Dec 31 '22 04:12