
RegEx - problem with multiline input

Tags: java, regex

I have a String with multiline content and want to select a multiline region, preferably using a regular expression (just because I'm trying to understand Java RegEx at the moment).

Consider the input like:

Line 1
abc START def
Line 2
Line 3
gh END jklm
Line 4

Assuming START and END each occur exactly once and mark the start and end of the region, I'd like to create a pattern/matcher to get the result:

 def
Line 2
Line 3
gh 

My current attempt is

Pattern p = Pattern.compile("START(.*)END");
Matcher m = p.matcher(input);
if (m.find())
  System.out.println(m.group(1));

But the result is

gh

So m.start() seems to point at the beginning of the line that contains the 'end marker'. I tried to add Pattern.MULTILINE to the compile call but that (alone) didn't change anything.

Where is my mistake?

Andreas Dolk asked Dec 22 '22 23:12


1 Answer

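You want Pattern.DOTALL, so that . also matches line terminators. By default . stops at the end of a line, so .* can never span from START to END when the two markers sit on different lines. Pattern.MULTILINE addresses a different issue: it only changes where the ^ and $ anchors match.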

Pattern p = Pattern.compile("START(.*)END", Pattern.DOTALL);
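
For completeness, here is a small self-contained sketch of that fix applied to the sample input from the question. The class name RegionDemo and the helper extractRegion are made up for illustration. The reluctant quantifier .*? is my own addition, not part of the one-liner above: with unique markers it gives the same result as .*, and it only matters if END could appear more than once.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class RegionDemo {
    // DOTALL makes '.' match line terminators too, so the group can span lines.
    // The embedded flag form "(?s)START(.*?)END" is equivalent.
    private static final Pattern REGION =
            Pattern.compile("START(.*?)END", Pattern.DOTALL);

    // Returns the text between START and END, or null if the markers are not found.
    static String extractRegion(String input) {
        Matcher m = REGION.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "Line 1\nabc START def\nLine 2\nLine 3\ngh END jklm\nLine 4";
        // Prints the region between the markers, including the embedded newlines.
        System.out.println(extractRegion(input));
    }
}
```

Note that group(1) keeps the whitespace next to the markers (the space before "def" and after "gh"), which matches the desired result in the question. Call trim() on it if you don't want that.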
Matthew Flaschen answered Jan 03 '23 10:01