
Java Regex: Matching a string between 2 colons

Tags: java, regex

I'm trying to write a Java regex that finds all the strings between two colons (:). If the string between the colons contains whitespace, line endings, or tabs, it should be ignored. Empty strings are also ignored. Underscores (_) are OK! The group can either include the enclosing colons or not.

Here are a few tests and the expected groups:

"test :candidate: test" => ":candidate:"
"test :candidate: test:" => ":candidate:"
"test :candidate:_test:" => ":candidate:", ":_test:"
"test :candidate::test" => ":candidate:"
"test ::candidate: test" => ":candidate:"
"test :candidate_: :candidate: test" => ":candidate_:", ":candidate:"
"test :candidate_:candidate: test" => ":candidate_:", ":candidate:"

I've tested a lot of regexes, and these two almost work:

":(\\w+):"
":[^:]+:"

I still have a problem when the 2 groups "share" a colon:

"test :candidate_: :candidate: test" => ":candidate_:", ":candidate:" // OK
"test :candidate_:candidate: test" => ":candidate_:" // ERROR! :(

It seems like the first group "consumes" the second colon and that the matcher can't find the second string I expected.
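A minimal reproduction of the problem, assuming a plain find() loop over the first pattern (the class and method names are only illustrative):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ConsumedColonDemo {
    // Collects every full match of ":(\\w+):" with the usual find() loop.
    // The name plainMatches is hypothetical, used only for this sketch.
    static List<String> plainMatches(String input) {
        Matcher m = Pattern.compile(":(\\w+):").matcher(input);
        List<String> out = new ArrayList<>();
        while (m.find()) {
            out.add(m.group()); // group() includes both enclosing colons
        }
        return out;
    }

    public static void main(String[] args) {
        // Separate colons: both strings are found.
        System.out.println(plainMatches("test :candidate_: :candidate: test"));
        // Shared colon: only the first string is found.
        System.out.println(plainMatches("test :candidate_:candidate: test"));
    }
}
```

With the shared colon, the second call returns only ":candidate_:", which is the behaviour described above.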

Can someone point me in the right direction to solve this problem? Can you also elaborate on why the matcher "consumes" the colon?

Thanks.

asked Dec 25 '22 06:12 by Vincent Durmont


1 Answer

Use a capturing group inside a positive lookahead to get the overlapping matches.

(?=(:\\w+:))

Note: The lookahead itself is zero-width, so the overall match is an empty string. Access the result by referring to capturing group #1.
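As for why the colon gets "consumed": a normal match covers the characters it matched, and find() resumes searching after the end of the previous match, so the closing colon of one match cannot also open the next one. A lookahead only checks what follows without advancing, so every colon remains available as a starting point. Here is a small sketch of how this might be used (the class and method names are made up for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OverlappingColonMatches {
    // The lookahead matches an empty string at each position that is
    // followed by ":word:", and the inner group captures that text.
    private static final Pattern OVERLAPPING = Pattern.compile("(?=(:\\w+:))");

    // Hypothetical helper: returns every captured ":word:" in order.
    static List<String> overlappingMatches(String input) {
        Matcher m = OVERLAPPING.matcher(input);
        List<String> out = new ArrayList<>();
        while (m.find()) {
            out.add(m.group(1)); // group(0) is empty; the text is in group 1
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(overlappingMatches("test :candidate_:candidate: test"));
        // prints [:candidate_:, :candidate:]
        System.out.println(overlappingMatches("test :candidate::test"));
        // prints [:candidate:]
    }
}
```

Because \w does not match whitespace or colons, strings containing spaces, tabs, or line endings are still skipped, as are empty strings between two adjacent colons.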

answered Dec 28 '22 07:12 by hwnd