I have a list of integers(or could be even strings), which I would like to sort by the frequency of occurrences in Python, for instance:
a = [1, 1, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 5]
Here the element 5
appears 4 times in the list, 4
appears 3 times. So the output sorted list would be :
result = [5, 5, 5, 5, 3, 3, 3, 4, 4, 4, 1, 1, 2]
I tried using a.count()
, but it gives the number of occurrence of the element.
I would like to sort it. Any idea how to do it ?
Thanks
The frequency sort algorithm is used to output elements of an array in descending order of their frequencies. If two elements have the same frequencies, then the element that occurs first in the input is printed first.
Method 4: Count occurrences of an element in a list Using countof() Operator. countOf() is used for counting the number of occurrences of b in a. It counts the number of occurrences of value.
from collections import Counter
print [item for items, c in Counter(a).most_common() for item in [items] * c]
# [5, 5, 5, 5, 3, 3, 3, 4, 4, 4, 1, 1, 2]
Or even better (efficient) implementation
from collections import Counter
from itertools import repeat, chain
print list(chain.from_iterable(repeat(i, c) for i,c in Counter(a).most_common()))
# [5, 5, 5, 5, 3, 3, 3, 4, 4, 4, 1, 1, 2]
Or
from collections import Counter
print sorted(a, key=Counter(a).get, reverse=True)
# [5, 5, 5, 5, 3, 3, 3, 4, 4, 4, 1, 1, 2]
If you prefer in-place sort
a.sort(key=Counter(a).get, reverse=True)
Using Python 3.3 and the built in sorted function, with the count as the key:
>>> a = [1,1,2,3,3,3,4,4,4,5,5,5,5]
>>> sorted(a,key=a.count)
[2, 1, 1, 3, 3, 3, 4, 4, 4, 5, 5, 5, 5]
>>> sorted(a,key=a.count,reverse=True)
[5, 5, 5, 5, 3, 3, 3, 4, 4, 4, 1, 1, 2]
In [15]: a = [1,1,2,3,3,3,4,4,4,5,5,5,5]
In [16]: counts = collections.Counter(a)
In [17]: list(itertools.chain.from_iterable([[k for _ in range(counts[k])] for k in sorted(counts, key=counts.__getitem__, reverse=True)]))
Out[17]: [5, 5, 5, 5, 3, 3, 3, 4, 4, 4, 1, 1, 2]
Alternatively:
answer = []
for k in sorted(counts, key=counts.__getitem__, reverse=True):
answer.extend([k for _ in range(counts[k])])
Of course, [k for _ in range(counts[k])]
can be replaced with [k]*counts[k]
.
So line 17 becomes
list(itertools.chain.from_iterable([[k]*counts[k] for k in sorted(counts, key=counts.__getitem__, reverse=True)]))
If you happen to be using numpy already, or if using it is an option, here's another alternative:
In [309]: import numpy as np
In [310]: a = [1, 2, 3, 3, 1, 3, 5, 4, 4, 4, 5, 5, 5]
In [311]: vals, counts = np.unique(a, return_counts=True)
In [312]: order = np.argsort(counts)[::-1]
In [313]: np.repeat(vals[order], counts[order])
Out[313]: array([5, 5, 5, 5, 4, 4, 4, 3, 3, 3, 1, 1, 2])
That result is a numpy array. If you want to end up with a Python list, call the array's tolist()
method:
In [314]: np.repeat(vals[order], counts[order]).tolist()
Out[314]: [5, 5, 5, 5, 4, 4, 4, 3, 3, 3, 1, 1, 2]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With