Python Collection Counter.most_common(n)
method returns the top n elements with their counts. However, if the counts for two elements is the same, how can I return the result sorted by alphabetical order?
For example: for a string like: BBBAAACCD
, for the "2-most common" elements, I want the result to be for specified n = 2
:
[('A', 3), ('B', 3), ('C', 2)]
and NOT:
[('B', 3), ('A', 3), ('C', 2)]
Notice that although A
and B
have the same frequency, A
comes before B
in the resultant list since it comes before B
in alphabetical order.
[('A', 3), ('B', 3), ('C', 2)]
How can I achieve that?
Python sorted() Function The sorted() function returns a sorted list of the specified iterable object. You can specify ascending or descending order. Strings are sorted alphabetically, and numbers are sorted numerically.
For example with a list of strings, specifying key=len (the built in len() function) sorts the strings by length, from shortest to longest. The sort calls len() for each string to get the list of proxy length values, and then sorts with those proxy values.
Although this question is already a bit old i'd like to suggest a very simple solution to the problem which just involves sorting the input of Counter() before creating the Counter object itself. If you then call most_common(n) you will get the top n entries sorted in alphabetical order.
from collections import Counter
char_counter = Counter(sorted('ccccbbbbdaef'))
for char in char_counter.most_common(3):
print(*char)
resulting in the output:
b 4
c 4
a 1
There are two issues here:
None of the solutions thus far address the first issue. You can use a heap queue with the itertools
unique_everseen
recipe (also available in 3rd party libraries such as toolz.unique
) to calculate the nth largest count.
Then use sorted
with a custom key.
from collections import Counter
from heapq import nlargest
from toolz import unique
x = 'BBBAAACCD'
c = Counter(x)
n = 2
nth_largest = nlargest(n, unique(c.values()))[-1]
def sort_key(x):
return -x[1], x[0]
gen = ((k, v) for k, v in c.items() if v >= nth_largest)
res = sorted(gen, key=sort_key)
[('A', 3), ('B', 3), ('C', 2)]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With