
How to parse text with a regex using optional groups (Java syntax preferred)

I want to make some of the groups in a regex optional. I am testing it on https://regex101.com/

The text is as follows:

start to proceed task TaskId = id Account = [email protected] Type = value1 Source = source_value SubSource = subSource_value

The optional groups are Source and SubSource; all the rest are mandatory.

I tried the following, but did not succeed in making those groups optional.

Regex:

 start to proceed task\s*TaskId\s*=\s*(.*)\s*Account\s*=\s*(.*)\s*Type\s*=\s*(.*)\s*Source\s*=\s*(.*)\s*SubSource\s*=\s*(.*) 

OUTPUT:

Group 1.    31-36   `id `
Group 2.    46-57   `[email protected] `
Group 3.    64-71   `value1 `
Group 4.    80-93   `source_value `
Group 5.    105-120 `subSource_value`

But when I remove Source, SubSource, or both (Source = source_value SubSource = subSource_value) from the text, nothing matches. My purpose is to still get the remaining groups, for example (depending on what was removed):

Group 1.    31-36   `id `
Group 2.    46-57   `[email protected] `
Group 3.    64-71   `value1 ` 
VitalyT asked Feb 06 '26 02:02


1 Answer

Here is a working script and pattern:

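import java.util.regex.Matcher;
import java.util.regex.Pattern;

String line = "start to proceed task TaskId = id Account = [email protected] Type = value1 Source = source_value SubSource = subSource_value";
// Each optional part carries its own leading \s+, and the trailing \s*$ anchors the match at the
// end of the line, so the lazy groups still expand to the full values when Source or SubSource is missing.
String pattern = "start to proceed task\\s+TaskId\\s*=\\s*(.*?)\\s+Account\\s*=\\s*(.*?)\\s+Type\\s*=\\s*(.*?)(?:\\s+Source\\s*=\\s*(.*?))?(?:\\s+SubSource\\s*=\\s*(.*?))?\\s*$";

Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(line);
if (m.find()) {
    System.out.println("Group 1: " + m.group(1));
    System.out.println("Group 2: " + m.group(2));
    System.out.println("Group 3: " + m.group(3));
    System.out.println("Group 4: " + m.group(4));  // null when Source is absent
    System.out.println("Group 5: " + m.group(5));  // null when SubSource is absent
}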

Group 1: id
Group 2: [email protected]
Group 3: value1
Group 4: source_value
Group 5: subSource_value


The crux of my changes is making the capture groups lazy (.*?). I also made the entire Source and SubSource parts of the pattern optional, e.g.

(?:\s+Source\s*=\s*(.*?))?

Notice that the surrounding group begins with ?:, which tells the regex engine not to capture it. So only your original (.*?) group is captured, and only when the text actually contains that part.
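To make the effect of the non-capturing wrapper concrete, here is a minimal sketch on a shortened line. The class name OptionalGroupDemo, the helper sourceOf, and the simplified pattern (which uses \S+ and so assumes values contain no whitespace) are illustrative assumptions, not part of the original pattern. It shows that the (?: ... )? wrapper does not get a group number, and that the inner group comes back as null when the optional part is missing.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OptionalGroupDemo {
    // Simplified pattern for illustration only: one mandatory field, one optional field.
    // The (?: ... )? wrapper is non-capturing, so only two groups are numbered.
    static final Pattern P =
        Pattern.compile("^Type\\s*=\\s*(\\S+)(?:\\s+Source\\s*=\\s*(\\S+))?$");

    // Hypothetical helper: returns the Source value, or null when Source is absent.
    static String sourceOf(String text) {
        Matcher m = P.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException("line does not match: " + text);
        }
        return m.group(2); // null if the optional group did not participate
    }

    public static void main(String[] args) {
        System.out.println(sourceOf("Type = value1 Source = source_value"));
        System.out.println(sourceOf("Type = value1"));
    }
}
```

In practice this means you should check m.group(n) for null before using the value of an optional group.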

In order to get the pattern to work, I needed to require \s+ (at least one whitespace character) instead of \s* between the fields.
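To check the cases from the question (Source removed, SubSource removed, or both), here is a hedged sketch that runs one pattern against all four variants. The class name TaskLineParser and the helper parse are assumptions made for illustration. The sketch also assumes each optional part carries its own leading \s+ and that the pattern ends with \s*$, so the lazy groups are forced to expand to the whole value rather than stopping early.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TaskLineParser {
    // Lazy groups plus an end anchor: each (.*?) grows only until the next key
    // (or the end of the line) can be matched.
    private static final Pattern LINE = Pattern.compile(
          "^start to proceed task\\s+TaskId\\s*=\\s*(.*?)"
        + "\\s+Account\\s*=\\s*(.*?)"
        + "\\s+Type\\s*=\\s*(.*?)"
        + "(?:\\s+Source\\s*=\\s*(.*?))?"      // optional, non-capturing wrapper
        + "(?:\\s+SubSource\\s*=\\s*(.*?))?"   // optional, non-capturing wrapper
        + "\\s*$");

    // Hypothetical helper: returns the fields in order, with null for absent
    // optional fields, or null if the line does not match at all.
    static Map<String, String> parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            return null;
        }
        Map<String, String> fields = new LinkedHashMap<>(); // permits null values
        fields.put("TaskId", m.group(1));
        fields.put("Account", m.group(2));
        fields.put("Type", m.group(3));
        fields.put("Source", m.group(4));
        fields.put("SubSource", m.group(5));
        return fields;
    }

    public static void main(String[] args) {
        String base = "start to proceed task TaskId = id Account = [email protected] Type = value1";
        String[] lines = {
            base + " Source = source_value SubSource = subSource_value",
            base + " Source = source_value",
            base + " SubSource = subSource_value",
            base
        };
        for (String line : lines) {
            System.out.println(parse(line));
        }
    }
}
```

Because the values are matched with .*? rather than \S+, a value that itself contains a space (such as the Account text above) is still captured whole.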

Tim Biegeleisen answered Feb 09 '26 10:02


