Does Python 3 string ordering depend on locale?

Do Python's str.__lt__ and sorted order strings by the Unicode code points of their characters, or by some locale-dependent collation rules?

Aivar asked Oct 22 '14 10:10


1 Answer

No, string ordering does not take locale into account. It is based entirely on the Unicode codepoint sort order.
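A small sketch of what code point ordering means in practice (the sample strings are illustrative only): uppercase ASCII letters sort before lowercase ones, and accented letters sort after both, regardless of the active locale.

```python
# Default string comparison uses Unicode code points, not alphabetical rules.
words = ["b", "a", "Z", "\u00e9"]  # "\u00e9" is the precomposed letter e-acute

by_default = sorted(words)
code_points = [ord(ch) for ch in by_default]

print(by_default)   # ['Z', 'a', 'b', 'é']
print(code_points)  # [90, 97, 98, 233]

# The same rule applies to whole strings: "Z" (90) is less than "a" (97).
print("Zebra" < "apple")  # True
```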

The locale module does provide you with a locale.strxfrm() function that can be used for locale-specific sorting:

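import locale

# strxfrm collates by the current LC_COLLATE setting, so adopt the user's locale first
locale.setlocale(locale.LC_COLLATE, "")
sorted(list_of_strings, key=locale.strxfrm)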
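Note that locale.strxfrm follows the process's current LC_COLLATE setting, which Python leaves at the "C" locale until locale.setlocale is called. The following is a minimal sketch, assuming a helper named locale_sorted (hypothetical, not part of the locale module) that switches the collation locale temporarily and restores it afterwards. The resulting order depends on which locales are installed on the system, so no particular output is promised for the locale-aware sort.

```python
import locale


def locale_sorted(strings, locale_name=""):
    """Sort strings by the collation rules of locale_name ("" = user's default).

    Hypothetical helper for illustration; not part of the standard library.
    """
    previous = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        # The requested locale is not installed; keep the current collation.
        pass
    try:
        return sorted(strings, key=locale.strxfrm)
    finally:
        # Restore the previous setting so other code is unaffected.
        locale.setlocale(locale.LC_COLLATE, previous)


words = ["apple", "Banana", "cherry"]
by_code_point = sorted(words)     # ['Banana', 'apple', 'cherry']
by_locale = locale_sorted(words)  # order depends on the system locale
print(by_code_point)
print(by_locale)
```

Because setlocale changes state for the whole process, a helper like this is not safe to call from several threads at once.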

This tool is quite limited; for any serious collation task you probably want to use the PyICU library:

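import icu  # current PyICU releases are imported under the name "icu"

# locale_spec is an ICU locale identifier, for example "de_DE"
collator = icu.Collator.createInstance(icu.Locale(locale_spec))
sorted(list_of_strings, key=collator.getSortKey)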
Martijn Pieters answered Oct 25 '22 14:10