Why is startswith slower than slicing

Tags:

python

startswith

Why is the implementation of startswith slower than slicing?

In [1]: x = 'foobar'

In [2]: y = 'foo'

In [3]: %timeit x.startswith(y)
1000000 loops, best of 3: 321 ns per loop

In [4]: %timeit x[:3] == y
10000000 loops, best of 3: 164 ns per loop

Surprisingly, even including calculation for the length, slicing still appears significantly faster:

In [5]: %timeit x[:len(y)] == y
1000000 loops, best of 3: 251 ns per loop

Note: the first part of this behaviour is noted in Python for Data Analysis (Chapter 3), but no explanation for it is offered.

If helpful: here is the C code for startswith; and here is the output of dis.dis:

In [6]: import dis

In [7]: dis_it = lambda x: dis.dis(compile(x, '<none>', 'eval'))

In [8]: dis_it('x[:3]==y')
  1           0 LOAD_NAME                0 (x)
              3 LOAD_CONST               0 (3)
              6 SLICE+2             
              7 LOAD_NAME                1 (y)
             10 COMPARE_OP               2 (==)
             13 RETURN_VALUE        

In [9]: dis_it('x.startswith(y)')
  1           0 LOAD_NAME                0 (x)
              3 LOAD_ATTR                1 (startswith)
              6 LOAD_NAME                2 (y)
              9 CALL_FUNCTION            1
             12 RETURN_VALUE

asked Nov 07 '12 13:11

Andy Hayden

4 Answers

Some of the performance difference can be explained by the time it takes the . operator (the attribute lookup) to do its thing:

>>> x = 'foobar'
>>> y = 'foo'
>>> sw = x.startswith
>>> %timeit x.startswith(y)
1000000 loops, best of 3: 316 ns per loop
>>> %timeit sw(y)
1000000 loops, best of 3: 267 ns per loop
>>> %timeit x[:3] == y
10000000 loops, best of 3: 151 ns per loop

Another portion of the difference can be explained by the fact that startswith is a function, and even no-op function calls take a bit of time:

>>> def f():
...     pass
... 
>>> %timeit f()
10000000 loops, best of 3: 105 ns per loop

This does not totally explain the difference, since the version using slicing and len calls a function and is still faster (compare to sw(y) above -- 267 ns):

>>> %timeit x[:len(y)] == y
1000000 loops, best of 3: 213 ns per loop

My only guess here is that maybe Python optimizes lookup time for built-in functions, or that len calls are heavily optimized (which is probably true). It might be possible to test that with a custom len function. Or possibly this is where the differences identified by LastCoder kick in. Note also larsmans' results, which indicate that startswith is actually faster for longer strings. The whole line of reasoning above applies only to those cases where the overhead I'm talking about actually matters.
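
One way to probe these guesses is to time each variant with the standard timeit module. The sketch below is only an illustration and not part of the measurements above: my_len is a hypothetical pure-Python wrapper standing in for a custom len function, and the absolute numbers will differ by machine and Python version.

```python
import timeit

x = 'foobar'
y = 'foo'
sw = x.startswith  # bound method fetched once, so no attribute lookup per call


def my_len(s):
    # Hypothetical pure-Python wrapper: adds an ordinary function call around len.
    return len(s)


STATEMENTS = [
    'x.startswith(y)',
    'sw(y)',
    'x[:3] == y',
    'x[:len(y)] == y',
    'x[:my_len(y)] == y',
]

# Explicit namespace so each timed statement can see the names defined above.
NAMESPACE = {'x': x, 'y': y, 'sw': sw, 'my_len': my_len}


def bench(stmt, number=100_000):
    """Return the best-of-3 time for stmt, in nanoseconds per loop."""
    best = min(timeit.repeat(stmt, globals=NAMESPACE, repeat=3, number=number))
    return best / number * 1e9


results = {}
for stmt in STATEMENTS:
    results[stmt] = bench(stmt)
    print('%-22s %8.1f ns per loop' % (stmt, results[stmt]))
```

If the my_len variant comes out clearly slower than the len variant, that would support the idea that the built-in call is cheaper than an ordinary Python-level call.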

answered Oct 22 '22 02:10

senderle

The comparison isn't fair since you're only measuring the case where startswith returns True.

>>> x = 'foobar'
>>> y = 'fool'
>>> %timeit x.startswith(y)
1000000 loops, best of 3: 221 ns per loop
>>> %timeit x[:3] == y  # note: length mismatch
10000000 loops, best of 3: 122 ns per loop
>>> %timeit x[:4] == y
10000000 loops, best of 3: 158 ns per loop
>>> %timeit x[:len(y)] == y
1000000 loops, best of 3: 210 ns per loop
>>> sw = x.startswith
>>> %timeit sw(y)
10000000 loops, best of 3: 176 ns per loop

Also, for much longer strings, startswith is a lot faster:

>>> import random
>>> x = '%030x' % random.randrange(256**10000)
>>> len(x)
20000
>>> y = x[:4000]
>>> %timeit x.startswith(y)
1000000 loops, best of 3: 211 ns per loop
>>> %timeit x[:len(y)] == y
1000000 loops, best of 3: 469 ns per loop
>>> sw = x.startswith
>>> %timeit sw(y)
10000000 loops, best of 3: 168 ns per loop

This is still true when there's no match.

# change last character of y
>>> y = y[:-1] + chr((ord(y[-1]) + 1) % 256)
>>> %timeit x.startswith(y)
1000000 loops, best of 3: 210 ns per loop
>>> %timeit x[:len(y)] == y
1000000 loops, best of 3: 470 ns per loop
>>> %timeit sw(y)
10000000 loops, best of 3: 168 ns per loop
# change first character of y
>>> y = chr((ord(y[0]) + 1) % 256) + y[1:]
>>> %timeit x.startswith(y)
1000000 loops, best of 3: 210 ns per loop
>>> %timeit x[:len(y)] == y
1000000 loops, best of 3: 442 ns per loop
>>> %timeit sw(y)
10000000 loops, best of 3: 168 ns per loop

So, startswith is probably slower for short strings because it's optimized for long ones.

(Trick to get random strings taken from this answer.)
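
The long-string experiment can be reproduced as a standalone script along these lines. This is a sketch under assumptions: the seed, the via_startswith and via_slice helper names, and the use of 'z' as a non-matching character are mine, and the wrapper functions add the same call overhead to both variants. One plausible reason for the gap, assuming CPython's usual behaviour, is that the slice builds a new 4000-character string before comparing, while startswith compares in place.

```python
import random
import timeit

random.seed(0)  # fixed seed so the run is repeatable
x = '%020000x' % random.randrange(256 ** 10000)  # 20000 hex digits, zero-padded
y = x[:4000]

# 'z' is not a hex digit, so the two altered prefixes are guaranteed not to match.
prefixes = {
    'match': y,
    'last char differs': y[:-1] + 'z',
    'first char differs': 'z' + y[1:],
}


def via_startswith(s, p):
    return s.startswith(p)


def via_slice(s, p):
    return s[:len(p)] == p  # builds a len(p)-character string, then compares


LOOPS = 20_000
agree = True
for label, p in prefixes.items():
    # Both approaches must give the same answer before timing means anything.
    agree = agree and via_startswith(x, p) == via_slice(x, p)
    for fn in (via_startswith, via_slice):
        best = min(timeit.repeat(lambda: fn(x, p), repeat=3, number=LOOPS))
        print('%-20s %-15s %8.1f ns per loop' % (label, fn.__name__, best / LOOPS * 1e9))
```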

answered Oct 22 '22 02:10

Fred Foo

startswith is more complex than slicing...

result = _string_tailmatch(self,
                           PyTuple_GET_ITEM(subobj, i),
                           start, end, -1);

What happens here isn't a simple character-by-character comparison of the needle against the beginning of the haystack. We're looking at a for loop that iterates through a vector/tuple (subobj) and calls another function (_string_tailmatch) on each item. Multiple function calls carry overhead: stack handling, argument sanity checks, and so on.

startswith is a library function while the slicing appears to be built into the language.

if (!stringlib_parse_args_finds("startswith", args, &subobj, &start, &end))
    return NULL;
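
To make the extra steps concrete, here is a rough pure-Python outline of the kind of work described above: parse the arguments, loop over a tuple if one was given, and hand each candidate to a tail-match helper. It is only a sketch. The names startswith_sketch and tailmatch_sketch are hypothetical, it is not the CPython implementation, and unusual start/end edge cases are not guaranteed to match the built-in.

```python
def tailmatch_sketch(s, prefix, start, end):
    """Compare prefix against s[start:end] in place, without building a slice."""
    lo, hi, _ = slice(start, end).indices(len(s))  # normalise indices like slicing
    if hi - lo < len(prefix):
        return False  # the window is too short to hold the prefix
    return all(s[lo + i] == ch for i, ch in enumerate(prefix))


def startswith_sketch(s, prefix, start=None, end=None):
    """Hypothetical outline of the steps str.startswith goes through."""
    if isinstance(prefix, tuple):
        # Tuple case: try each candidate prefix in turn.
        return any(tailmatch_sketch(s, p, start, end) for p in prefix)
    if not isinstance(prefix, str):
        raise TypeError('prefix must be str or a tuple of str')
    return tailmatch_sketch(s, prefix, start, end)


print(startswith_sketch('foobar', 'foo'))
print(startswith_sketch('foobar', ('baz', 'bar'), 3))
```

Even in this simplified form there are several checks and calls before the first character is compared, which is the overhead the slicing expression avoids.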

answered Oct 22 '22 01:10

Louis Ricci

To quote the docs, startswith does more than you might think:

str.startswith(prefix[, start[, end]])

Return True if string starts with the prefix, otherwise return False. prefix can also be a tuple of prefixes to look for. With optional start, test string beginning at that position. With optional end, stop comparing string at that position.
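
A short illustration of those extra features, next to rough slicing equivalents. The string and the dictionary names are made up for the example.

```python
s = 'foobar'

# The four forms described in the docs quote above.
checks = {
    'plain prefix': s.startswith('foo'),
    'tuple of prefixes': s.startswith(('baz', 'foo')),
    'with start': s.startswith('bar', 3),
    'with start and end': s.startswith('bar', 3, 5),  # only 'ba' is in range
}

# Rough slicing equivalents of the same four checks.
by_slicing = {
    'plain prefix': s[:3] == 'foo',
    'tuple of prefixes': s[:3] == 'baz' or s[:3] == 'foo',
    'with start': s[3:3 + len('bar')] == 'bar',
    'with start and end': s[3:5][:len('bar')] == 'bar',
}

for label in checks:
    print('%-20s startswith=%-5s slicing=%s' % (label, checks[label], by_slicing[label]))
```

The slicing versions have to spell out by hand what startswith handles through its arguments, which is part of why the method has more to do on every call.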

answered Oct 22 '22 01:10

Eric
