I need to find all words that are made of only the letters 'a' and 'b', where every instance of 'a' is immediately preceded by 'b' and immediately followed by 'b'.
For example:
mystring = 'bab babab babbab ab baba aba xyz'
Then my regex should return:
['bab', 'babab', 'babbab']
(In 'ab', the 'a' is not preceded by 'b', and the same goes for 'aba'. In 'baba', the last 'a' is not followed by 'b'. 'xyz' is not made of only 'a' and 'b'.)
I used a lookbehind and a lookahead for this and wrote this regex:
re.findall(r'((?<=b)a(?=b))',mystring)
But this only returns the individual instances of 'a' that are preceded and followed by 'b':
['a', 'a', 'a', 'a', 'a', 'a']
But I need whole words. How can I find whole words using regex? I tried to modify my regex with various options, but nothing seems to work. How can this be done?
You can use the following regex:
>>> re.findall(r'\b(?:b+a)+b+\b',mystring)
['bab', 'babab', 'babbab']
This regex matches one or more repetitions of 'b+a' (one or more 'b' followed by a single 'a'), so every 'a' is preceded by 'b'. It then requires one or more trailing 'b', so the last 'a' is also followed by 'b'. The '\b' word boundaries on both sides make sure the whole word is matched rather than a part of it.
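To see the pattern at work, here is a minimal, self-contained sketch. The helper names find_words and find_words_by_split are made up for illustration and are not part of the original answer. The second helper is an alternative approach, assuming the words are separated by whitespace: it splits the text and tests each word in full with re.fullmatch.

```python
import re

# Pattern from the answer: one or more groups of "b...ba", then trailing b's,
# anchored on word boundaries so that partial words are rejected.
PATTERN = re.compile(r'\b(?:b+a)+b+\b')


def find_words(text):
    """Return whole words in which every 'a' has a 'b' on both sides."""
    # The group is non-capturing, so findall returns the full matches.
    return PATTERN.findall(text)


def find_words_by_split(text):
    """Illustrative alternative: split on whitespace, test each word in full."""
    return [w for w in text.split() if re.fullmatch(r'(?:b+a)+b+', w)]


mystring = 'bab babab babbab ab baba aba xyz'
print(find_words(mystring))           # ['bab', 'babab', 'babbab']
print(find_words_by_split(mystring))  # ['bab', 'babab', 'babbab']
```

Note that this pattern requires at least one 'a', so a word such as 'bbb' is not returned. If words made only of 'b' should count as well, changing (?:b+a)+ to (?:b+a)* should accept them.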