Why does searching for larger strings in a `reversed` string take more time than slice reversing?

Tags:

As I read here, reversing a string using the reversed function is more efficient than slice notation,string[::-1]. But when I tried it myself, I observed different results.

First, I tried to make a very big string. And then I tried to check how much time it takes to check whether a string exists in the large one or not. This is what I did:

In [1]: large = "abcdefgijklmnopqrstuvwxyz1234567890!@#$%^&*()_=+0}{QWERT"                                                                           

In [2]: large = 1000*large                                                                                                                           

In [3]: large = 1000*large  

In [5]: len(large)                                                                                                                                   
Out[5]: 56000000

In [6]: %time "a" in large[::-1]                                                                                                                     
CPU times: user 63 ms, sys: 43.4 ms, total: 106 ms
Wall time: 106 ms
Out[6]: True

In [7]: %time "a" in reversed(large)                                                                                                                 
CPU times: user 11 µs, sys: 1 µs, total: 12 µs
Wall time: 17.6 µs
Out[7]: True

If I check the existence of only 1 char in large, reversed is much faster, but when I attempt it for bigger strings, the result changes:


In [8]: %time "ab" in large[::-1]                                                                                                                    
CPU times: user 99.2 ms, sys: 44.1 ms, total: 143 ms
Wall time: 143 ms
Out[8]: False

In [9]: %time "ab" in reversed(large)                                                                                                                
CPU times: user 1.73 s, sys: 4.48 ms, total: 1.73 s
Wall time: 1.74 s
Out[9]: False
In [10]: %time "abc" in large[::-1]                                                                                                                  
CPU times: user 125 ms, sys: 20 ms, total: 145 ms
Wall time: 145 ms
Out[10]: False

In [11]: %time "abc" in reversed(large)                                                                                                              
CPU times: user 1.72 s, sys: 6.52 ms, total: 1.73 s
Wall time: 1.74 s
Out[11]: False

What's happening under the hood?

319

asked Jul 11 '21 11:07

Amir reza Riahi

1 Answers

The two are not the same, and may produce different boolean outcomes. For instance:

s = "ab"
print("ba" in s[::-1])  # True
print("ba" in reversed(s))  # False

The in operator on a string has different behaviour: it looks for a substring match. The in operator on a iterator will try to find a match on individually iterated values, i.e. single characters in this case.

So you cannot really compare these like that.

As to why it is slower to get a False outcome: the iterator will create separate strings for each iterated character to which then the needle string ("ba") is compared from scratch.

In the string version, there is an optimised search algorithm to find a substring in a larger string, which is one operation for Python. The efficient string search algorithm is implemented in lower level code, like C.

answered Oct 11 '22 18:10

trincot

Related questions
                            
                                Is there any way to use SageMath on macOS Big Sur?
                            
                                Finding relationships between values based on their name in Python with Panda
                            
                                Pandas how to get rows with consecutive dates and sales more than 1000?
                            
                                TypeError: Could not build a TypeSpec with type KerasTensor
                            
                                How to programmatically open an application by name on macOS?
                            
                                How to get real and imaginary parts of a column in a Pandas dataframe?
                            
                                When can you use a name before it's defined?
                            
                                Finding best matched word from large Vocalist
                            
                                plotly express plots have faded colors
                            
                                What is the "nomkl" Python package used for?
                            
                                Document Layout Analysis for text extraction
                            
                                Pandas normalise by column on groupby
                            
                                Python/pip process are killed in virtualenv (Apple M1 chip)
                            
                                KeyError caused by a wrong attribute assignment within the network?
                            
                                Exporting a Geopandas dataframe to a zipped shapefile directly
                            
                                Problem with combining fastapi with plotly.dash and adding token dependency as auth
                            
                                Matplotlib directory not found while using Pyinstaller to create exe from py files
                            
                                Only keep df column values that contain a string from list of string
                            
                                WARNING: xcodeproj is not installed or is not configured properly
                            
                                psycopg2.errors.ActiveSqlTransaction: CREATE TABLESPACE cannot run inside a transaction block

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Why does searching for larger strings in a `reversed` string take more time than slice reversing?

Tags:

performance

python

reverse

Amir reza Riahi

People also ask

1 Answers

trincot

Recent Activity

Donate For Us