regex string and substring

Tags:

python

regex

substr

I have a character string 'aabaacaba'. Starting from the left, I am trying to get all substrings of length >= 2 that appear again later in the string. For instance, aa appears again later in the string, and so does ab.

I wrote the following regex code:

re.findall(r'([a-z]{2,})(?:[a-z]*)(?:\1)', 'aabaacaba')

and I get ['aa'] as the answer. The regular expression misses the ab pattern. I think this is because of overlapping characters: the first match consumes them, so the search cannot start again inside it. Please suggest how the expression could be fixed. Thank you.

asked May 14 '17 02:05

Sumit

1 Answer

You can use a look-ahead assertion, which does not consume the matched string, so a match can be attempted at every position:

>>> re.findall(r'(?=([a-z]{2,})(?=.*\1))', 'aabaacaba')
['aa', 'aba', 'ba']

NOTE: aba is matched instead of ab, because the {2,} quantifier is greedy and tries to match as long a substring as possible at each position.
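
If the shorter candidates such as ab are wanted as well, one possible approach is to run the same look-ahead pattern once per fixed length instead of relying on the greedy {2,}. The following is only a sketch under that assumption; the function name repeated_substrings and its min_len parameter are made up for illustration and are not part of the re module.

```python
import re

def repeated_substrings(text, min_len=2):
    """Collect substrings of length >= min_len that occur again later,
    after the end of their first occurrence (hypothetical helper)."""
    found = []
    # A substring longer than half the text cannot repeat without overlapping.
    for n in range(min_len, len(text) // 2 + 1):
        # Same idea as the look-ahead pattern above, but with a fixed length n
        # so that greedy matching cannot hide the shorter candidates.
        pattern = r'(?=([a-z]{%d})(?=.*\1))' % n
        for match in re.findall(pattern, text):
            if match not in found:  # keep first-seen order, skip duplicates
                found.append(match)
    return found

print(repeated_substrings('aabaacaba'))
# ['aa', 'ab', 'ba', 'aba']
```

The results are grouped by length rather than by position, and like the original pattern this sketch only considers lowercase ASCII letters.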

answered Sep 17 '22 13:09

falsetru
