I've a list of lists containing string represented numbers:
nums = [['1','3'],['2','2'],['1','2'],['0','2'],['11','2']]
I need to sort these numerically ascending by the first then second entries without modify the string representation of the numbers in the original list. Also, want to avoid creating another 2nd copy of the list with everything mapped to integers explicitly--imagine this is a huge list.
Both sort()
and sorted()
works nicely with tuples and lists, so with a lambda key, I can do the following:
>>> sorted(nums, key=lambda n: (int(n[0]),int(n[1]))
[['0', '2'], ['1', '2'], ['1', '3'], ['2', '2'], ['11', '2']]
Happy days...
However, I've seen a number of postings about sorting being faster using the operator.itemgetter()
as key function over using a lambda. Without getting into a discussion of the validity of these claims, does anyone if it's possible apply a conversion from string to integer for comparision when using operator.itemgetter()
:
The following obviously fails, due to the strings being compared as strings, not numbers:
>>> sorted(nums, key=operator.itemgetter(0,1))
[['0', '2'], ['1', '2'], ['1', '3'], ['11', '2'], ['2', '2']]
operator.itemgetter
is fast not because it does something special in sort
, but because it is entirely written in c, and does not involve calling a python function.
So what you're looking for is a C function that does what you want - itemgetter
is a red herring.
In python 2, you can avoid calling pure-python functions with key=functools.partial(map, int)
. This won't work in python 3, because map
no longer returns a list or tuple. This also might not be any faster than your solution.
There are ways, for example using iteration_utilities.chained
1 and functools.partial
:
>>> import operator import itemgetter
>>> from iteration_utilities import chained
>>> from functools import partial
>>> itemgetter_int = chained(operator.itemgetter(0, 1), partial(map, int), tuple)
>>> sorted(nums, key=itemgetter_int)
[['0', '2'], ['1', '2'], ['1', '3'], ['2', '2'], ['11', '2']]
It works but it's definetly slower than using a lambda
or custom defined function.
If you really need speed you could cythonize the lambda
-function (or write it in C by hand) but if you just need it in one place just use a throw-away lambda
. Especially if it's in sorted
because it has O(nlog(n))
comparisons so the O(n)
function calls probably don't contribute much to the overall execution time.
1: This is a function in a 3rd party extension module that I authored. It needs to be seperatly installed, for example via conda
or pip
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With