
How does str.startswith really work?


I've been playing for a bit with startswith() and I've discovered something interesting:

    >>> tup = ('1', '2', '3')
    >>> lis = ['1', '2', '3', '4']
    >>> '1'.startswith(tup)
    True
    >>> '1'.startswith(lis)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: startswith first arg must be str or a tuple of str, not list

Now, the error message is clear, and converting the list to a tuple works just as the original tuple did:

    >>> '1'.startswith(tuple(lis))
    True

Now, my question is: why must the first argument be a str or a tuple of str prefixes, but not a list of str prefixes?

AFAIK, the Python code for startswith() might look like this:

    def startswith(src, prefix):
        return src[:len(prefix)] == prefix

But that just confuses me more, because even with that in mind, it still shouldn't make any difference whether it is a list or a tuple. What am I missing?


asked Jul 15 '17 11:07

Cajuu'

1 Answer

Technically, there is no reason other sequence types could not be accepted. The source code roughly does this:

    if isinstance(prefix, tuple):
        # A tuple: try each candidate prefix in turn
        for substring in prefix:
            if not isinstance(substring, str):
                raise TypeError(...)
            if tailmatch(...):
                return True
        return False
    elif not isinstance(prefix, str):
        raise TypeError(...)
    # A single str prefix
    return tailmatch(...)

(where tailmatch(...) does the actual matching work).
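
To make the outline above concrete, here is a minimal pure-Python sketch of the same argument handling. It is only a model, not CPython's actual C implementation: the names `startswith_model` and the simplified `tailmatch` are made up for illustration, and the error messages are approximations.

```python
def tailmatch(text, substring):
    """Simplified stand-in for the internal matching helper (prefix case only)."""
    return text[:len(substring)] == substring


def startswith_model(text, prefix):
    """Illustrative model of how str.startswith treats its first argument."""
    if isinstance(prefix, tuple):
        # A tuple: check each candidate and stop at the first match.
        for substring in prefix:
            if not isinstance(substring, str):
                raise TypeError("tuple for startswith must only contain str")
            if tailmatch(text, substring):
                return True
        return False
    if not isinstance(prefix, str):
        # Lists, sets and other iterables are rejected here.
        raise TypeError(
            "startswith first arg must be str or a tuple of str, "
            f"not {type(prefix).__name__}"
        )
    # A single str prefix.
    return tailmatch(text, prefix)


print(startswith_model('1', ('1', '2', '3')))  # True
print(startswith_model('1', ('2', '3')))       # False
```

Passing a list, as in `startswith_model('1', ['1', '2'])`, raises `TypeError` in this model, mirroring the behaviour shown in the question.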

So yes, any iterable would do for that for loop. But all the other string test APIs that take multiple values (as well as isinstance() and issubclass()) also accept only tuples, and this tells you, as a user of the API, that it is safe to assume the value won't be mutated. You can't mutate a tuple, but the method could in theory mutate a list.
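
As a quick illustration of that consistency, the sketch below calls `str.endswith()`, `isinstance()` and `issubclass()` first with a tuple and then with a list. The helper `accepts` and the sample values are assumptions made up for this example.

```python
def accepts(call):
    """Return True if the call succeeds, False if it raises TypeError."""
    try:
        call()
    except TypeError:
        return False
    return True


suffixes = ('.py', '.pyw')
types = (int, float)

# Tuples are accepted by all three APIs.
tuple_ok = {
    'endswith': accepts(lambda: 'script.py'.endswith(suffixes)),
    'isinstance': accepts(lambda: isinstance(3, types)),
    'issubclass': accepts(lambda: issubclass(bool, types)),
}

# The same values wrapped in a list are rejected with TypeError.
list_ok = {
    'endswith': accepts(lambda: 'script.py'.endswith(list(suffixes))),
    'isinstance': accepts(lambda: isinstance(3, list(types))),
    'issubclass': accepts(lambda: issubclass(bool, list(types))),
}

print(tuple_ok)
print(list_ok)
```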

Also note that you usually test for a fixed number of prefixes or suffixes or classes (in the case of isinstance() and issubclass()); the implementation is not suited for a large number of elements. A tuple implies that you have a limited number of elements, while lists can be arbitrarily large.

Next, if any iterable or sequence type were acceptable, that would include strings; a single string is also a sequence. Should a single string argument then be treated as separate characters, or as a single prefix?
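
The ambiguity is easy to demonstrate with a hypothetical variant that accepts any iterable. The function `startswith_any_iterable` below is not part of Python; it is an assumption invented purely to show what such a rule would do to a plain string argument.

```python
def startswith_any_iterable(text, prefixes):
    """Hypothetical variant: treat *any* iterable as a collection of prefixes."""
    return any(text.startswith(p) for p in prefixes)


# A list of prefixes behaves as one would hope.
print(startswith_any_iterable('apple', ['ap', 'b']))  # True

# But a plain string is iterated character by character, so 'xa' is
# read as the two prefixes 'x' and 'a' rather than the prefix 'xa'.
print(startswith_any_iterable('apple', 'xa'))  # True

# The real method treats the string as a single prefix.
print('apple'.startswith('xa'))  # False
```

Accepting only `str` or `tuple` sidesteps this: a string always means one prefix, and a tuple always means several.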

In other words, the limitation self-documents that the sequence won't be mutated, is consistent with other APIs, implies a limited number of items to test against, and removes ambiguity as to how a single string argument should be treated.

Note that this was brought up before on the Python Ideas list; see this thread. Guido van Rossum's main argument there is that you must either special-case single strings or accept only a tuple. He picked the latter and doesn't see a need to change this.

answered Sep 20 '22 20:09

Martijn Pieters
