I want to match a pattern against a string that consists only of digits, such as '2324235235980980'. The pattern is described below:
The pattern is '2-6-8-7-4'. It starts with 2 and transitions to 6. From 6 it can either self-loop or transition to 8. From 8 it can go back to 6 (so it may move back and forth between 6 and 8), self-loop at 8, or transition to 7. The same applies to 7: it can self-loop or go back to 8, so a sequence like 7-8-6-8-7 can happen. Finally, 7 can reach 4, and once it reaches 4 the pattern is done. If any other digit shows up along the way, the pattern has to start from 2 again to be counted. I use
import re
re.findall(r'(2((6+8+)+)7)', test_string)
The output includes '2666686888668887', but I don't know how to write the pattern once I add the final 4. Does anyone have an idea? Thanks a lot!
I think this is easier to achieve than it first appears:
26[68]+?[687]+?4
That is: 2, followed by 6, followed by one or more of 6 or 8, followed by one or more of 6, 8 or 7, followed by 4.
The only not-so-obvious part is making the quantifiers lazy (+? instead of +).
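To try this pattern from Python, something like the following should work. The sample strings are my own assumptions (the first is the output from the question with a 4 appended), so treat this as a quick sketch rather than a full test:

```python
import re

# The first pattern from this answer, compiled once for reuse.
pattern = re.compile(r'26[68]+?[687]+?4')

# Sample inputs are assumptions made up for illustration.
samples = ['26666868886688874', '2324235235980980', '26684']

for s in samples:
    # No capturing groups, so findall returns the full matches.
    print(s, pattern.findall(s))
```

Note that '26684' is accepted as well, since the character classes do not force a 7 right before the 4. If that transition matters, the pattern needs to be tightened.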
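Here is a stricter pattern that also rules out 6 and 7 sitting next to each other:
\b26(?:(?<!7)6|8|(?<!6)7)+?4\b
That is: 2, followed by 6, followed by one or more of (a 6 not preceded by 7), 8, or (a 7 not preceded by 6), followed by 4. The \b anchors require word boundaries at both ends, so in a string made up only of digits the match has to cover the whole string. Drop them to find the pattern inside a longer digit string.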
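If every transition from the question has to be enforced (including that only 7 may step to 4), one option is to spell the state machine out explicitly. This is a minimal sketch based on my reading of the question. The name STRICT and the sample strings are hypothetical:

```python
import re

# Hypothetical name; each piece mirrors one part of the 2-6-8-7-4 walk.
STRICT = re.compile(
    r'2'
    r'6+(?:8+6+)*'           # reach 6, self-loop, and make 6<->8 round trips
    r'8+7+'                  # move on from 8 to 7 (both may self-loop)
    r'(?:8+(?:6+8+)*7+)*'    # optional trips from 7 back to 8 (and 6), returning to 7
    r'4'                     # only reachable directly after a 7
)

# Sample inputs are assumptions made up for illustration.
for s in ['26666868886688874', '268786874', '26684', '26674']:
    print(s, STRICT.findall(s))
```

All groups are non-capturing, so findall returns whole matches. Here '26684' and '26674' are rejected because 8 cannot step to 4 and 6 cannot step to 7.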