I work on an application that uses texts from different languages, so, for viewing or reporting purposes, some texts (strings) need to be sorted in a specific language.
Currently I have a workaround messing with the global locale settings, which is bad, and I don't want to put it in production:
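    import functools
    import locale

    default_locale = locale.getlocale(locale.LC_COLLATE)

    def sort_strings(strings, locale_=None):
        if locale_ is None:
            return sorted(strings)
        # Workaround: temporarily switch the process-wide collation locale.
        locale.setlocale(locale.LC_COLLATE, locale_)
        try:
            # cmp= only exists in Python 2; cmp_to_key works on 2.7 and 3.x.
            return sorted(strings, key=functools.cmp_to_key(locale.strcoll))
        finally:
            locale.setlocale(locale.LC_COLLATE, default_locale)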
The official Python locale documentation explicitly says that saving and restoring the locale is a bad idea, but it does not suggest an alternative: http://docs.python.org/library/locale.html#background-details-hints-tips-and-caveats
You could use PyICU's collator to avoid changing global settings:
    import icu  # PyICU

    def sorted_strings(strings, locale=None):
        if locale is None:
            return sorted(strings)
        collator = icu.Collator.createInstance(icu.Locale(locale))
        return sorted(strings, key=collator.getSortKey)

Example:

    >>> L = [u'sandwiches', u'angel delight', u'custard', u'éclairs', u'glühwein']
    >>> sorted_strings(L)
    ['angel delight', 'custard', 'glühwein', 'sandwiches', 'éclairs']
    >>> sorted_strings(L, 'en_US')
    ['angel delight', 'custard', 'éclairs', 'glühwein', 'sandwiches']
Disadvantage: dependency on the PyICU library; the behavior is slightly different from locale.strcoll.
I don't know of a way to get a locale.strxfrm function for a given locale name without changing the locale globally. As a hack you could run your function in a different child process:
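    import multiprocessing

    pool = multiprocessing.Pool()
    # ...
    # locale_aware_sort (not shown) would set the locale and sort inside the worker
    pool.apply(locale_aware_sort, [strings, loc])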
Disadvantage: might be slow, resource hungry
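For illustration, the child-process hack could be filled in roughly as follows. This is only a sketch: the body of locale_aware_sort is an assumption about what such a worker would do (the answer does not define it), and sort_in_child is a hypothetical helper name. The locale name passed in must be installed on the machine, otherwise setlocale raises locale.Error.

```python
import locale
import multiprocessing


def locale_aware_sort(strings, loc):
    """Sort strings by the collation rules of loc.

    Assumed worker body: it changes LC_COLLATE for whichever process
    calls it, so it is meant to run inside a worker process only.
    """
    locale.setlocale(locale.LC_COLLATE, loc)
    return sorted(strings, key=locale.strxfrm)


def sort_in_child(strings, loc):
    """Hypothetical helper: run locale_aware_sort in a one-off worker.

    The parent's locale is never touched, at the cost of starting a
    process and pickling the strings in both directions.
    """
    with multiprocessing.Pool(processes=1) as pool:
        return pool.apply(locale_aware_sort, [strings, loc])
```

A call such as sort_in_child(strings, 'de_DE.UTF-8') would then be made from the main program, under an `if __name__ == "__main__":` guard on platforms that spawn rather than fork worker processes. Creating a new pool per call is the simplest form; reusing one long-lived pool would reduce the start-up cost mentioned above.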
Using an ordinary threading.Lock won't work unless you can control every place where locale-aware functions (they are not limited to the locale module, e.g., re) could be called from multiple threads.
You could compile your function using Cython to synchronize access using the GIL. The GIL will make sure that no other Python code can be executed while your function is running.
Disadvantage: not pure Python