
Why is sorting a python list of tuples faster when I explicitly provide the key as the first element?

Tags: python, sorting, timeit, tuples, python-2.7

Sorting a list of tuples (dictionary key/value pairs where the key is a random string) is faster when I do not explicitly specify that the key should be used (edit: after adding operator.itemgetter(0), following a comment by @chepner, the key version is now faster!):

import timeit

setup ="""
import operator  # needed by the itemgetter statement below
import random
import string

random.seed('slartibartfast')
d={}
for i in range(1000):
    d[''.join(random.choice(string.ascii_uppercase) for _ in range(16))] = 0
"""
print min(timeit.Timer('for k,v in sorted(d.iteritems()): pass',
        setup=setup).repeat(7, 1000))
print min(timeit.Timer('for k,v in sorted(d.iteritems(),key=lambda x: x[0]): pass',
        setup=setup).repeat(7, 1000))
print min(timeit.Timer('for k,v in sorted(d.iteritems(),key=operator.itemgetter(0)): pass',
        setup=setup).repeat(7, 1000))

Gives:

0.575334150664
0.579534521128
0.523808984422 (the itemgetter version!)

If, however, the keys are instances of a custom class, passing key=lambda x: x[0] explicitly to sorted makes it faster:

setup ="""
import operator  # needed by the itemgetter statement below
import random
import string

random.seed('slartibartfast')
d={}

class A(object):
    def __init__(self):
        self.s = ''.join(random.choice(string.ascii_uppercase) for _ in
              range(16))
    def __hash__(self): return hash(self.s)
    def __eq__(self, other):
        return self.s == other.s
    def __ne__(self, other): return self.s != other.s
    # def __cmp__(self, other): return cmp(self.s ,other.s)

for i in range(1000):
    d[A()] = 0
"""
print min(timeit.Timer('for k,v in sorted(d.iteritems()): pass',
        setup=setup).repeat(3, 1000))
print min(timeit.Timer('for k,v in sorted(d.iteritems(),key=lambda x: x[0]): pass',
        setup=setup).repeat(3, 1000))
print min(timeit.Timer('for k,v in sorted(d.iteritems(),key=operator.itemgetter(0)): pass',
        setup=setup).repeat(3, 1000))

Gives:

4.65625458083
1.87191002252
1.78853626684

Is this expected? It seems the second element of the tuple is being compared in the version without a key, but shouldn't the keys already compare unequal?

Note: uncommenting the __cmp__ method gives worse results overall, but the keyed versions are still markedly faster:

8.11941771831
5.29207000173
5.25420037046

As expected, the built-in (address) comparison is faster.

EDIT: here are the profiling results from my original code that triggered the question, first without the key argument:

         12739 function calls in 0.007 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.007    0.007 <string>:1(<module>)
        1    0.000    0.000    0.007    0.007 __init__.py:6527(_refreshOrder)
        1    0.002    0.002    0.006    0.006 {sorted}
     4050    0.003    0.000    0.004    0.000 bolt.py:1040(__cmp__) # here is the custom object
     4050    0.001    0.000    0.001    0.000 {cmp}
     4050    0.000    0.000    0.000    0.000 {isinstance}
        1    0.000    0.000    0.000    0.000 {method 'sort' of 'list' objects}
      291    0.000    0.000    0.000    0.000 __init__.py:6537(<lambda>)
      291    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 bolt.py:1240(iteritems)
        1    0.000    0.000    0.000    0.000 {method 'iteritems' of 'dict' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

and here are the results when I specify the key:

         7027 function calls in 0.004 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.004    0.004 <string>:1(<module>)
        1    0.000    0.000    0.004    0.004 __init__.py:6527(_refreshOrder)
        1    0.001    0.001    0.003    0.003 {sorted}
     2049    0.001    0.000    0.002    0.000 bolt.py:1040(__cmp__)
     2049    0.000    0.000    0.000    0.000 {cmp}
     2049    0.000    0.000    0.000    0.000 {isinstance}
        1    0.000    0.000    0.000    0.000 {method 'sort' of 'list' objects}
      291    0.000    0.000    0.000    0.000 __init__.py:6538(<lambda>)
      291    0.000    0.000    0.000    0.000 __init__.py:6533(<lambda>)
      291    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 bolt.py:1240(iteritems)
        1    0.000    0.000    0.000    0.000 {method 'iteritems' of 'dict' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
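Profiles like the two above can be gathered with cProfile around the sorted call. The following is only a sketch for Python 3 (the question's code is Python 2): the Item class, the pairs list and the profile_sort helper are hypothetical stand-ins for the objects in bolt.py, and __lt__ takes the place of __cmp__, which no longer exists in Python 3.

```python
import cProfile
import io
import pstats


class Item:
    """Hypothetical stand-in for the custom object being sorted."""

    def __init__(self, s):
        self.s = s

    def __eq__(self, other):
        return self.s == other.s

    def __lt__(self, other):
        return self.s < other.s


# 300 (Item, value) pairs with distinct, shuffled-looking string keys.
pairs = [(Item(str(i * 7919 % 1000)), 0) for i in range(300)]


def profile_sort(**sort_kwargs):
    """Profile one sorted() call and return the report as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    sorted(pairs, **sort_kwargs)
    profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats()
    return out.getvalue()


report_plain = profile_sort()
report_keyed = profile_sort(key=lambda pair: pair[0])
print(report_plain)
print(report_keyed)
```

Comparing the ncalls column for the comparison methods in the two reports shows how much Python-level work each variant does.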

Apparently it is the __cmp__ method, not __eq__, that is called there (edit: because that class defines __cmp__ but not __eq__; the order in which equality and comparison methods are resolved determines which one runs).

In the code in this question, the __eq__ method is indeed called (8605 times), as seen by adding debug prints (see the comments).

So the difference is as stated in the answer by @chepner. The last thing I am not quite clear on is why those tuple equality calls are needed (in other words, why __eq__ is called at all instead of calling __cmp__ directly).

FINAL EDIT: I asked about this last point here: Why in comparing python tuples of objects is __eq__ and then __cmp__ called? It turns out to be an optimization: tuple comparison calls __eq__ on the tuple elements and only calls __cmp__ for elements that are not equal. So this is now perfectly clear. I had thought it called __cmp__ directly, so initially it seemed to me that specifying the key was simply unneeded, and after chepner's answer I still did not see where the equality calls came from.
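The order of calls described in the final edit can be observed directly. This is a small sketch for Python 3, where __lt__ plays the role that __cmp__ plays in the Python 2 code of the question; the Key class and the trace_less_than helper are hypothetical names introduced only for illustration.

```python
calls = []  # names of the comparison methods, in the order they run


class Key:
    """Hypothetical stand-in for the question's class A."""

    def __init__(self, s):
        self.s = s

    def __eq__(self, other):
        calls.append("__eq__")
        return self.s == other.s

    def __lt__(self, other):
        calls.append("__lt__")
        return self.s < other.s


def trace_less_than(left, right):
    """Evaluate left < right and return (result, methods called)."""
    del calls[:]
    result = left < right
    return result, list(calls)


# First elements differ: equality is tried first, then the ordering test.
differing = trace_less_than((Key("a"), 0), (Key("b"), 0))
# First elements are equal: the decision falls to the second elements.
equal_keys = trace_less_than((Key("a"), 0), (Key("a"), 1))

print(differing)
print(equal_keys)
```

On CPython 3 the first comparison should report __eq__ followed by __lt__, and the second only __eq__, since the integers in the second slot decide the result.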

Gist: https://gist.github.com/Utumno/f3d25e0fe4bd0f43ceb9178a60181a53


asked Dec 24 '15 16:12

Mr_and_Mrs_D

1 Answer

There are two issues at play.

1. Comparing two values of builtin types (such as int) happens in C. Comparing two values of a class with an __eq__ method happens in Python; repeatedly calling __eq__ imposes a significant performance penalty.
2. The function passed with key is called once per element, rather than once per comparison. This means that lambda x: x[0] is called once per tuple to build a list of A instances to be compared. Without key, you need to make O(n lg n) tuple comparisons, each of which requires a call to A.__eq__ to compare the first element of each tuple.

The first explains why your first set of results is under a second while the second set takes several seconds. The second explains why using key is faster regardless of the values being compared.
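The two points can be checked by counting calls rather than timing them. What follows is a minimal Python 3 sketch, not the code from the question: the class A here takes its string as an argument and defines __lt__ (Python 3 has no __cmp__ and no default ordering), and the count_calls and first helpers are hypothetical names.

```python
import random
import string


class A:
    """Variant of the question's class that counts comparison calls."""

    eq_calls = 0
    lt_calls = 0

    def __init__(self, s):
        self.s = s

    def __eq__(self, other):
        A.eq_calls += 1
        return self.s == other.s

    def __lt__(self, other):
        A.lt_calls += 1
        return self.s < other.s


def count_calls(items, **sort_kwargs):
    """Sort items and return (number of __eq__ calls, number of __lt__ calls)."""
    A.eq_calls = A.lt_calls = 0
    sorted(items, **sort_kwargs)
    return A.eq_calls, A.lt_calls


rng = random.Random(0)
items = [
    (A("".join(rng.choice(string.ascii_uppercase) for _ in range(16))), 0)
    for _ in range(1000)
]

key_calls = 0


def first(pair):
    """Key function that also counts how often it is invoked."""
    global key_calls
    key_calls += 1
    return pair[0]


plain_eq, plain_lt = count_calls(items)
keyed_eq, keyed_lt = count_calls(items, key=first)

print("without key: __eq__ calls =", plain_eq, " __lt__ calls =", plain_lt)
print("with key:    __eq__ calls =", keyed_eq, " __lt__ calls =", keyed_lt)
print("key function calls =", key_calls)
```

The key function runs once per element, and with a key the A instances are compared directly, so the extra __eq__ call that every tuple comparison makes disappears.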

answered Oct 15 '22 13:10

chepner
