I have a character string 'aabaacaba'. Starting from the left, I am trying to get all substrings of size >= 2 that appear again later in the string. For instance, aa appears again in the string, and so does ab.
I wrote the following regex code:
re.findall(r'([a-z]{2,})(?:[a-z]*)(?:\1)', 'aabaacaba')
and I get ['aa'] as the answer. The regular expression misses the ab pattern. I think this is because of overlapping characters. Please suggest how the expression could be fixed. Thank you.
You can use a look-ahead assertion, which does not consume the matched string:
>>> re.findall(r'(?=([a-z]{2,})(?=.*\1))', 'aabaacaba')
['aa', 'aba', 'ba']
NOTE: aba is matched instead of ab, because the greedy quantifier {2,} tries to match as long a substring as possible at each starting position.
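If ab is wanted as well, two variations may help. The following is only a sketch, not part of the original answer. The lazy quantifier {2,}? returns the shortest repeated substring at each starting position. The helper repeated_substrings (a hypothetical name introduced here) enumerates every length explicitly, under the assumption that "appears later" means a non-overlapping later occurrence, which is the same rule the look-ahead regex applies.

```python
import re

s = 'aabaacaba'

# Variation 1: a lazy quantifier picks the shortest repeat at each position.
shortest = re.findall(r'(?=([a-z]{2,}?)(?=.*\1))', s)

# Variation 2: plain-Python enumeration of every length (hypothetical helper).
def repeated_substrings(text, min_len=2):
    """Return substrings of length >= min_len that occur again,
    without overlap, later in text, in order of first appearance."""
    found = []
    for i in range(len(text)):
        for n in range(min_len, len(text) - i + 1):
            sub = text[i:i + n]
            if sub in found:
                continue  # already reported; longer ones may still be new
            if text.find(sub, i + n) == -1:
                break  # if this does not repeat, no longer extension can
            found.append(sub)
    return found

print(shortest)                # ['aa', 'ab', 'ba']
print(repeated_substrings(s))  # ['aa', 'ab', 'aba', 'ba']
```

The regex variations report at most one substring per starting position, so the explicit loop is the one to use when every repeated length is needed.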