
Sort string array first on length then alphabetically in Python 3

How do I sort a list in Python first by the length of the words (longest to shortest), and then alphabetically?

Here is what I mean:

I have this list: WordsArray = ["Lorem", "ipsum", "dolor", "sit", "amet", "consectetur", "adipiscing", "elit", "sed", "do", "eiusmod", "tempor", "incididunt"]

I want to output this: ['consectetur', 'adipiscing', 'incididunt', 'eiusmod', 'tempor', 'dolor', 'ipsum', 'Lorem', 'amet', 'elit', 'sed', 'sit', 'do']

I can already sort alphabetically using print(sorted(WordsArray)):

['Lorem', 'adipiscing', 'amet', 'consectetur', 'do', 'dolor', 'eiusmod', 'elit', 'incididunt', 'ipsum', 'sed', 'sit', 'tempor']
PortugalTheMan asked Dec 23 '22 13:12


2 Answers

Firstly, using just sorted will not sort alphabetically. Look at your output: L should not come before a. What you are currently doing is a case-sensitive sort.
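To see why the uppercase word comes first, it can help to look at the code points Python compares. This is only a small sketch; the variable names are made up for illustration.

```python
# Default string comparison uses the code points of the characters.
upper_l = ord("L")   # 76
lower_a = ord("a")   # 97

# "Lorem" sorts before "adipiscing" because 76 < 97.
default_order = sorted(["adipiscing", "Lorem"])
print(upper_l, lower_a, default_order)
```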

You can perform a case-insensitive sort by using a key function like so:

>>> words_list = ["Lorem", "ipsum", "dolor", "sit", "amet", "consectetur", "adipiscing", "elit", "sed", "do", "eiusmod", "tempor", "incididunt"]
>>> sorted(words_list, key=str.lower)
['adipiscing', 'amet', 'consectetur', 'do', 'dolor', 'eiusmod', 'elit', 'incididunt', 'ipsum', 'Lorem', 'sed', 'sit', 'tempor']

You can then modify the key function as below to sort first on length (longest first), then alphabetically:

>>> def custom_key(word):  # named "word" to avoid shadowing the built-in str
...   return -len(word), word.lower()
... 
>>> sorted(words_list, key=custom_key)
['consectetur', 'adipiscing', 'incididunt', 'eiusmod', 'tempor', 'dolor', 'ipsum', 'Lorem', 'amet', 'elit', 'sed', 'sit', 'do']
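As an alternative to the tuple key (not what the answer above uses), the same order can probably be reached with two passes, relying on the fact that Python's sort is stable and that reverse=True keeps equal items in their existing order. A minimal sketch, with variable names chosen for illustration:

```python
words_list = ["Lorem", "ipsum", "dolor", "sit", "amet", "consectetur",
              "adipiscing", "elit", "sed", "do", "eiusmod", "tempor",
              "incididunt"]

# Pass 1: secondary criterion, case-insensitive alphabetical order.
by_alpha = sorted(words_list, key=str.lower)

# Pass 2: primary criterion, longest first. Stability keeps the
# alphabetical order from pass 1 among words of equal length.
by_length_then_alpha = sorted(by_alpha, key=len, reverse=True)
print(by_length_then_alpha)
```

The single tuple key is usually the simpler choice; the two-pass form is mainly useful when the criteria need different sort directions that cannot be expressed by negating a value.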
Sash Sinha answered May 22 '23 05:05


As the key, you can use a tuple that holds first the negative length of the string, -len(x), and then x itself:

sorted(WordsArray, key=lambda x: (-len(x),x))

Tuples are compared element by element: the first elements are compared, the second elements are consulted only on a tie, and so on. So the two strings are first compared on -len(x), which means the longer string is sorted first.

If both strings have the same length, we compare on x, that is, alphabetically.
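The tuple comparison can be checked directly on a few words. This is just an illustrative sketch; length_then_alpha is a hypothetical helper name, equivalent to the lambda above.

```python
def length_then_alpha(word):
    # Negative length puts longer words first; the word itself breaks ties.
    return (-len(word), word)

# Different lengths: the first tuple element decides.
longer_first = length_then_alpha("amet") < length_then_alpha("sed")

# Same length: the comparison falls through to the second element.
tie_broken_by_word = length_then_alpha("sed") < length_then_alpha("sit")

result = sorted(["sit", "amet", "sed", "elit", "do"], key=length_then_alpha)
print(longer_first, tie_broken_by_word, result)
```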

Mind that comparing two strings is case-sensitive: Python orders them lexicographically, where the order is determined by the ord(..) of the first characters, then the next ones, and so on. As a result, every uppercase ASCII letter sorts before every lowercase one. If you want to order alphabetically, you should convert upper case and lower case to the same case. A quick way to handle this is:

sorted(WordsArray, key=lambda x: (-len(x),x.lower()))

But this is not always correct: for instance, the German eszett (ß) is sometimes treated as ss. In fact, sorting alphabetically is a very hard problem in some languages, so in those cases you need to specify a collation.
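For the ß case specifically, str.casefold() is one option from the standard library: it folds ß to ss, whereas str.lower() leaves it unchanged. The sketch below only illustrates that difference on made-up sample words; it is not full collation, for which something like locale.strxfrm with a suitable locale would be needed.

```python
words = ["Mast", "Maß", "mass"]

# lower() keeps "ß", whose code point is above all ASCII letters,
# so "Maß" ends up last.
with_lower = sorted(words, key=str.lower)

# casefold() maps "ß" to "ss", so "Maß" and "mass" get the same key
# and keep their original relative order (the sort is stable).
with_casefold = sorted(words, key=str.casefold)

print(with_lower)
print(with_casefold)
```

Note that casefold() changes only the sort key, so -len(x) in the tuple key would still count ß as a single character.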

Willem Van Onsem answered May 22 '23 03:05