
Arrange elements with same count in alphabetical order

Python's collections.Counter.most_common(n) method returns the n most common elements with their counts. However, if two elements have the same count, how can I return them sorted in alphabetical order?

For example, for the string BBBAAACCD and the "2 most common" elements (n = 2), I want the result to be:

[('A', 3), ('B', 3), ('C', 2)]

and NOT:

[('B', 3), ('A', 3), ('C', 2)]

Notice that although A and B have the same frequency, A comes before B in the resultant list since it comes before B in alphabetical order.
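
A minimal reproduction of the unwanted ordering, assuming Python 3.7 or later, where most_common keeps tied elements in the order they were first encountered. The argument 3 is used here only so that both tied elements and C are displayed:

```python
from collections import Counter

counts = Counter('BBBAAACCD')

# Ties keep first-encountered order (Python 3.7+), so 'B' precedes 'A'.
default_order = counts.most_common(3)
print(default_order)
```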

How can I achieve that?

asked Apr 18 '17 05:04 by stfd1123581321



2 Answers

Although this question is a bit old, I'd like to suggest a very simple solution: sort the input to Counter() before creating the Counter object. Calling most_common(n) then returns the top n entries with ties in alphabetical order, because on Python 3.7+ elements with equal counts keep the order in which they were first encountered.

from collections import Counter

# Sorting the input means tied elements are first seen in alphabetical order.
char_counter = Counter(sorted('ccccbbbbdaef'))
for char in char_counter.most_common(3):
    print(*char)

resulting in the output:

b 4
c 4
a 1
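
The same idea applied to the string from the question, next to a sketch of an explicit sort that does not depend on insertion order. The tie-preserving behaviour of most_common is assumed to be that of Python 3.7+, and the helper name top_n_alphabetical is hypothetical, used only for illustration:

```python
from collections import Counter

def top_n_alphabetical(text, n):
    """Return the n most common items, breaking ties alphabetically."""
    counts = Counter(text)
    # Sort by descending count, then ascending element.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:n]

# Approach from this answer: sort the input first (relies on Python 3.7+).
presorted = Counter(sorted('BBBAAACCD')).most_common(3)

# Explicit sort: does not rely on the order in which elements were first seen.
explicit = top_n_alphabetical('BBBAAACCD', 3)

print(presorted)
print(explicit)
```

Both calls should agree on this input. The explicit sort costs O(k log k) for k distinct elements, which is usually negligible next to counting the input itself.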
answered Oct 02 '22 21:10 by DJSchaffner


There are two issues here:

  1. The "top n" refers to the n largest distinct counts, so every element whose count ties with one of them must be included.
  2. Elements with the same count must be ordered alphabetically.

None of the solutions so far addresses the first issue. You can use a heap queue with the itertools unique_everseen recipe (also available in third-party libraries, e.g. toolz.unique) to calculate the nth largest distinct count.

Then use sorted with a custom key.

from collections import Counter
from heapq import nlargest
from toolz import unique

x = 'BBBAAACCD'

c = Counter(x)
n = 2
nth_largest = nlargest(n, unique(c.values()))[-1]

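def sort_key(item):
    # Descending count first, then ascending (alphabetical) element.
    return -item[1], item[0]

# Keep every element whose count is at least the nth largest distinct count.
gen = ((k, v) for k, v in c.items() if v >= nth_largest)
res = sorted(gen, key=sort_key)

print(res)
[('A', 3), ('B', 3), ('C', 2)]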
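
If adding toolz is not desirable, a plain set can stand in for unique, since only the distinct counts matter and their order does not. A minimal sketch under that assumption; the function name top_n_counts_with_ties is hypothetical:

```python
from collections import Counter
from heapq import nlargest

def top_n_counts_with_ties(text, n):
    """Return every item whose count is among the n largest distinct counts."""
    counts = Counter(text)
    if not counts or n <= 0:
        return []
    # Smallest of the n largest distinct counts (fewer if fewer exist).
    threshold = nlargest(n, set(counts.values()))[-1]
    kept = [(k, v) for k, v in counts.items() if v >= threshold]
    # Descending count first, then alphabetical order for ties.
    return sorted(kept, key=lambda kv: (-kv[1], kv[0]))

print(top_n_counts_with_ties('BBBAAACCD', 2))
```

With n = 1 the same function returns only the elements tied for the highest count, here A and B.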
answered Oct 02 '22 19:10 by jpp