I'm trying to write a Java regex that will find all the strings between 2 colons (:). If the string between the colons has whitespace, line endings or tabs, it should be ignored. Empty strings are also ignored. Underscores (_) are OK! The group can either include the enclosing colons or not.
Here are a few tests and the expected groups:
"test :candidate: test" => ":candidate:"
"test :candidate: test:" => ":candidate:"
"test :candidate:_test:" => ":candidate:", ":_test:"
"test :candidate::test" => ":candidate:"
"test ::candidate: test" => ":candidate:"
"test :candidate_: :candidate: test" => ":candidate_:", ":candidate:"
"test :candidate_:candidate: test" => ":candidate_:", ":candidate:"
I've tried a lot of regexes, and these almost work:
":(\\w+):"
":[^:]+:"
I still have a problem when the 2 groups "share" a colon, that is, when the closing colon of one group is also the opening colon of the next:
"test :candidate_: :candidate: test" => ":candidate_:", ":candidate:" // OK
"test :candidate_:candidate: test" => ":candidate_:" // ERROR! :(
It seems like the first group "consumes" the second colon and that the matcher can't find the second string I expected.
Can someone point me in the right direction to solve this problem? Can you also elaborate on why the matcher "consumes" the colon?
Thanks.
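Use a positive lookahead with a capturing group inside it to get the overlapping matches. A regular match consumes every character it covers, including the closing colon, so the next search starts after that colon and can no longer use it as an opening colon. A lookahead is zero-width: it checks what follows without consuming anything, so the shared colon is still available for the next match.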
(?=(:\\w+:))
Note: The overall match is empty, so you access your match result by referring to capturing group #1.
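To make the difference concrete, here is a minimal sketch that runs both a consuming pattern and the lookahead pattern over the problematic input from the question. The class name ColonTokens, the helper findAll and the constant names are made up for this illustration; only java.util.regex.Pattern and Matcher come from the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ColonTokens {
    // Consuming pattern: each match swallows its closing colon.
    static final Pattern CONSUMING = Pattern.compile(":\\w+:");
    // Lookahead pattern: the match itself is empty; the token sits in group 1.
    static final Pattern OVERLAPPING = Pattern.compile("(?=(:\\w+:))");

    // Hypothetical helper: collects every token the pattern finds in the input.
    static List<String> findAll(Pattern pattern, String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        while (m.find()) {
            // Use group 1 when the pattern defines one, otherwise the whole match.
            tokens.add(m.groupCount() > 0 ? m.group(1) : m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "test :candidate_:candidate: test";
        System.out.println(findAll(CONSUMING, input));   // [:candidate_:]
        System.out.println(findAll(OVERLAPPING, input)); // [:candidate_:, :candidate:]
    }
}
```

Because the lookahead match is empty, find() moves forward one position at a time, which is why the shared colon gets tested again as a possible opening colon.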