Inspired by this nice answer, here's a benchmark:
import timeit

def test1():
    a = [1, 2, 3]
    a.insert(0, 1)

def test2():
    a = [1, 2, 3]
    a[0:0] = [1]

print(timeit.timeit('test1()', 'from __main__ import test1'))
print(timeit.timeit('test2()', 'from __main__ import test2'))
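As a sanity check, both approaches leave the list in the same state, so the benchmark really is comparing two ways of doing the same thing:

```python
# Both operations insert 1 at the front of [1, 2, 3].
a = [1, 2, 3]
a.insert(0, 1)

b = [1, 2, 3]
b[0:0] = [1]

print(a, b)  # both are [1, 1, 2, 3]
```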
For me, test2 is slightly faster (~10%). Why is that the case? I would have expected it to be slower. Can anybody help me understand this?
(Using Python 2.7 on OS X 10.5.8.)
Your first test case has to call the method insert on the list a, whereas all the operations in test2 are handled directly in bytecode. Note the CALL_FUNCTION in the disassembly of test1 below. Calling functions is moderately expensive in Python: certainly expensive enough to account for a few percent difference in run time.
>>> import dis
>>> dis.dis(test1)
2 0 LOAD_CONST 1 (1)
3 LOAD_CONST 2 (2)
6 LOAD_CONST 3 (3)
9 BUILD_LIST 3
12 STORE_FAST 0 (a)
3 15 LOAD_FAST 0 (a)
18 LOAD_ATTR 0 (insert)
21 LOAD_CONST 4 (0)
24 LOAD_CONST 1 (1)
27 CALL_FUNCTION 2
30 POP_TOP
31 LOAD_CONST 0 (None)
34 RETURN_VALUE
>>> dis.dis(test2)
2 0 LOAD_CONST 1 (1)
3 LOAD_CONST 2 (2)
6 LOAD_CONST 3 (3)
9 BUILD_LIST 3
12 STORE_FAST 0 (a)
3 15 LOAD_CONST 1 (1)
18 BUILD_LIST 1
21 LOAD_FAST 0 (a)
24 LOAD_CONST 4 (0)
27 LOAD_CONST 4 (0)
30 STORE_SLICE+3
31 LOAD_CONST 0 (None)
34 RETURN_VALUE
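The call overhead itself is easy to see in isolation. A rough illustration (the comparison is my own, and absolute timings vary by machine and interpreter):

```python
import timeit

# Time a no-op function call against a bare name lookup: the gap is
# roughly the per-call overhead (argument handling, frame setup).
t_call = timeit.timeit('f()', setup='f = lambda: None')
t_lookup = timeit.timeit('x', setup='x = 0')

print(t_call > t_lookup)  # the call is the slower of the two
```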
I posted this first, but after consideration I think it is not correct. The difference I describe here should only make a noticeable difference when there is a lot of data to be moved, which isn't the case in the test here. And even with a lot of data the difference is only a couple of percent:
import timeit

def test1():
    a = range(10000000)
    a.insert(1, 1)

def test2():
    a = range(10000000)
    a[1:1] = [1]
>>> timeit.timeit(test1, number=10)
6.008707046508789
>>> timeit.timeit(test2, number=10)
5.861173868179321
The method list.insert is implemented by the function ins1 in listobject.c. You'll see that it copies the item references for the tail of the list one by one:
for (i = n; --i >= where; )
    items[i+1] = items[i];
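In Python terms, that loop amounts to shifting every trailing reference one slot to the right, one element at a time (a sketch of the idea, not CPython's actual code):

```python
def insert_sketch(items, where, value):
    # Model of ins1: grow by one, shift the tail right element by
    # element (the C loop above), then drop the new value in place.
    items.append(None)  # make room, as list_resize does in C
    for i in range(len(items) - 2, where - 1, -1):
        items[i + 1] = items[i]
    items[where] = value

lst = [10, 20, 30]
insert_sketch(lst, 1, 99)
print(lst)  # [10, 99, 20, 30]
```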
On the other hand, slice assignment is implemented by the function list_ass_slice, which calls memmove:
memmove(&item[ihigh+d], &item[ihigh],
        (k - ihigh)*sizeof(PyObject *));
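The effect of that call, expressed in Python, is a single bulk move of the tail rather than an element-by-element loop (again a sketch; list_ass_slice itself works at the level of raw pointers):

```python
def slice_assign_sketch(items, lo, hi, new):
    # Model of list_ass_slice: save the tail, then rebuild the list
    # as head + new + tail; memmove shifts the tail in one bulk step.
    tail = items[hi:]
    del items[lo:]
    items.extend(new)
    items.extend(tail)

lst = [1, 2, 3, 4]
slice_assign_sketch(lst, 1, 1, [99])
print(lst)  # [1, 99, 2, 3, 4]
```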
So I think the answer to your question is that the C library function memmove is better optimized than the simple loop. See here for the glibc implementation of memmove: I believe that when called from list_ass_slice it eventually ends up calling _wordcopy_bwd_aligned, which you can see is heavily hand-optimized.