Case-insensitive string startswith in Python

You could use a regular expression as follows:

In [32]: import re

In [33]: bool(re.match('he', 'Hello', re.I))
Out[33]: True

In [34]: bool(re.match('el', 'Hello', re.I))
Out[34]: False
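
If the prefix is not a fixed literal, it may contain regex metacharacters, so a reusable version of this idea would probably want to escape it first. A minimal sketch, assuming a hypothetical helper name startswith_ci_re that is not part of the original answer:

```python
import re

def startswith_ci_re(s, prefix):
    """Return True if s starts with prefix, ignoring case (hypothetical helper)."""
    # re.escape makes characters such as '.' or '(' match literally.
    # re.match only ever matches at the start of the string.
    return re.match(re.escape(prefix), s, re.IGNORECASE) is not None

print(startswith_ci_re('Hello', 'he'))  # True
print(startswith_ci_re('Hello', 'el'))  # False
print(startswith_ci_re('a.b', 'A.'))    # True
print(startswith_ci_re('axb', 'a.'))    # False
```

Without re.escape, the last call would return True because an unescaped '.' matches any character.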

On a 2000-character string this is about 20 times faster than calling lower() on the whole string:

In [38]: s = 'A' * 2000

In [39]: %timeit s.lower().startswith('he')
10000 loops, best of 3: 41.3 us per loop

In [40]: %timeit bool(re.match('el', s, re.I))
100000 loops, best of 3: 2.06 us per loop

If you are matching the same prefix repeatedly, pre-compiling the regex can make a large difference:

In [41]: p = re.compile('he', re.I)

In [42]: %timeit p.match(s)
1000000 loops, best of 3: 351 ns per loop
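
Note that p.match(s) returns a match object or None rather than a bool. One way to package the pre-compiled pattern, sketched here with a hypothetical factory name make_prefix_matcher (an assumption, not something from the answer above):

```python
import re

def make_prefix_matcher(prefix):
    """Compile the pattern once and return a function that tests strings against it."""
    pattern = re.compile(re.escape(prefix), re.IGNORECASE)

    def matches(s):
        # pattern.match anchors at the start of s and returns None on failure
        return pattern.match(s) is not None

    return matches

starts_with_he = make_prefix_matcher('he')
print(starts_with_he('HELLO world'))  # True
print(starts_with_he('A' * 2000))     # False
```

The compile cost is paid once when the matcher is created, so this only helps when the same prefix is tested many times.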

For short prefixes, slicing the prefix out of the string before converting it to lowercase could be even faster:

In [43]: %timeit s[:2].lower() == 'he'
1000000 loops, best of 3: 287 ns per loop

Relative timings of these approaches will of course depend on the length of the prefix. On my machine the breakeven point seems to be about six characters, which is when the pre-compiled regex becomes the fastest method.

In my experiments, checking every character separately could be even faster:

In [44]: %timeit (s[0] == 'h' or s[0] == 'H') and (s[1] == 'e' or s[1] == 'E')
1000000 loops, best of 3: 189 ns per loop

However, this method only works for prefixes that are known when you're writing the code, and doesn't lend itself to longer prefixes.

How about this:

prefix = 'he'
if myVeryLongStr[:len(prefix)].lower() == prefix.lower():
    ...  # the string starts with the prefix, ignoring case
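
Wrapped in a function, the slicing approach might look like the sketch below. The name startswith_ci is a hypothetical choice, and the comparison is only reliable for text where lower() does not change a string's length (plain ASCII, for instance).

```python
def startswith_ci(s, prefix):
    """Case-insensitive prefix test that lowercases only the leading slice of s."""
    # Slicing first means lower() works on len(prefix) characters,
    # not on the entire (possibly very long) string.
    return s[:len(prefix)].lower() == prefix.lower()

print(startswith_ci('Hello world', 'HE'))  # True
print(startswith_ci('Hello world', 'el'))  # False
print(startswith_ci('H', 'he'))            # False, the string is shorter than the prefix
```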

Another simple solution is to pass a tuple to startswith() covering every casing you need to match, e.g. .startswith(('case1', 'case2', ...)). Only the casings listed in the tuple will match.

For example:

>>> 'Hello'.startswith(('He', 'HE'))
True
>>> 'HEllo'.startswith(('He', 'HE'))
True
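
For a short prefix the tuple of casings can be generated instead of typed out. A minimal sketch, assuming a hypothetical helper case_variants; the number of variants doubles with each cased character, so this is only practical for very short prefixes:

```python
from itertools import product

def case_variants(prefix):
    """Return every upper/lower-case combination of prefix as a tuple."""
    # For each character, collect its distinct lower- and upper-case forms;
    # characters without case (digits, punctuation) contribute one choice.
    choices = [sorted({ch.lower(), ch.upper()}) for ch in prefix]
    return tuple(''.join(combo) for combo in product(*choices))

variants = case_variants('he')
print(variants)                      # ('HE', 'He', 'hE', 'he')
print('hEllo'.startswith(variants))  # True
print('Jello'.startswith(variants))  # False
```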

None of the given answers is actually correct as soon as you consider anything outside the ASCII range.

For example, in a case-insensitive comparison, ß should be considered equal to SS if you're following Unicode's case mapping rules.

To get correct results, the easiest solution is to install the third-party regex module (not part of the standard library), which follows the standard:

import re
import regex
# enable the new, improved engine instead of the backwards-compatible v0
regex.DEFAULT_VERSION = regex.VERSION1

print(re.match('ß', 'SS', re.IGNORECASE))        # None
print(regex.match('ß', 'SS', regex.IGNORECASE))  # matches
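
If installing a third-party module is not an option, the standard library's str.casefold() (Python 3.3+) applies Unicode case folding and handles the ß example. This is a swapped-in technique rather than what the answer above proposes, and the helper name startswith_casefold is hypothetical:

```python
def startswith_casefold(s, prefix):
    """Prefix test using Unicode case folding from the standard library."""
    # Fold the whole string: folding can change its length ('ß' becomes 'ss'),
    # so slicing s to len(prefix) before folding would give wrong answers.
    return s.casefold().startswith(prefix.casefold())

print(startswith_casefold('SSH tunnel', 'ß'))        # True
print(startswith_casefold('Straße', 'STRASSE'))      # True
print('SSH tunnel'.lower().startswith('ß'.lower()))  # False, lower() keeps 'ß' as is
```

This sketch folds the entire string, so it gives up the speed advantage of slicing, and it does not normalize composed versus decomposed characters. It can also report a match that ends partway through a folded character, for example the prefix 's' against a string beginning with 'ß'.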

Depending on how .lower() performs, if the prefix is short enough it might be faster to compare each character explicitly:

s = 'A' * 2000
prefix = 'he'
ch0 = s[0]
ch1 = s[1]
# Parentheses are required here: `and` binds more tightly than `or`
substr = (ch0 == 'h' or ch0 == 'H') and (ch1 == 'e' or ch1 == 'E')

Timing (using the same 2000-character string as the timings above):

>>> timeit.timeit("ch0 = s[0]; ch1 = s[1]; ch0 == 'h' or ch0 == 'H' and ch1 == 'e' or ch1 == 'E'", "s = 'A' * 2000")
0.2509511683747405

= 0.25 us per loop

Compared to the existing method:

>>> timeit.timeit("s.lower().startswith('he')", "s = 'A' * 2000", number=10000)
0.6162763703208611

= 61.63 us per loop

(This is horrible, of course, but if the code is extremely performance-critical then it might be worth it.)
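
Timings like these vary with the Python version and the machine, so it may be worth re-measuring locally. The script below is a rough sketch for doing that; the candidate labels and the benchmark function are hypothetical names, and the lambda wrappers add a small constant overhead to every measurement.

```python
import re
import timeit

s = 'A' * 2000
pattern = re.compile('he', re.IGNORECASE)

# Each candidate tests whether s starts with 'he', ignoring case.
candidates = {
    'lower().startswith': lambda: s.lower().startswith('he'),
    'compiled regex': lambda: pattern.match(s) is not None,
    'slice then lower': lambda: s[:2].lower() == 'he',
    'per-character': lambda: (s[0] == 'h' or s[0] == 'H') and (s[1] == 'e' or s[1] == 'E'),
}

def benchmark(number=100_000):
    """Return the average time per call, in microseconds, for each candidate."""
    results = {}
    for name, func in candidates.items():
        total_seconds = timeit.timeit(func, number=number)
        results[name] = total_seconds / number * 1e6
    return results

for name, microseconds in benchmark().items():
    print(f'{name:20s} {microseconds:8.3f} us per loop')
```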
