Been fiddling with this for hours...
I'm trying to parse error messages of this form:
[error] C:\Me\MyPath\myFile.scala:18:22: not found: value getaa
I can do this fine with the following regex:
\[(error|warn)\]\s+(.+):(\d+):(?:\d+:)\s+(.+)$
Which correctly produces groups:
error
C:\Me\MyPath\myFile.scala
18
not found: value getaa
But to make this robust, I need to make the 22: part optional (since some versions of the Scala compiler don't output the column number). In other words, it needs to produce the same groups as above for this string too:
[error] C:\Me\MyPath\myFile.scala:18: not found: value getaa
I've tried putting a question mark after the optional group, but that doesn't work - it messes up the original groups. I assume there's some stuff about lazy vs greedy that I'm not understanding. Here is a working sample on regex101. Thanks for any help.
You need to add two question marks:
\[(error|warn)\]\s+(.+?):(\d+):(?:\d+:)?\s+(.+)$
                      ^                ^
See a regex demo
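The .+? will match any 1+ chars other than line break chars, as few as possible, and will thus stop at the first place where the subpatterns that follow can match. With the greedy .+, the path group stretches as far as it can, so once the column group becomes optional it swallows :18 and (\d+) captures 22 instead. The second ? makes the (?:\d+:) group optional.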
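To see the difference concretely, here is a small sketch in Python (chosen only for illustration; the question does not say which regex engine is used, though this pattern behaves the same way in most flavors). The names GREEDY, LAZY, WITH_COLUMN and WITHOUT_COLUMN are made up for this example.

```python
import re

# Both patterns make the column group optional; they differ only in (.+) vs (.+?).
GREEDY = re.compile(r"\[(error|warn)\]\s+(.+):(\d+):(?:\d+:)?\s+(.+)$")
LAZY = re.compile(r"\[(error|warn)\]\s+(.+?):(\d+):(?:\d+:)?\s+(.+)$")

# The two sample lines from the question (raw strings keep the backslashes literal).
WITH_COLUMN = r"[error] C:\Me\MyPath\myFile.scala:18:22: not found: value getaa"
WITHOUT_COLUMN = r"[error] C:\Me\MyPath\myFile.scala:18: not found: value getaa"

# Greedy path group: ":18" ends up inside the path and "22" becomes the line number.
greedy_groups = GREEDY.match(WITH_COLUMN).groups()

# Lazy path group: the path stops at the first colon that is followed by digits and a colon.
lazy_groups = LAZY.match(WITH_COLUMN).groups()
lazy_groups_no_column = LAZY.match(WITHOUT_COLUMN).groups()

print(greedy_groups)
print(lazy_groups)
print(lazy_groups_no_column)
```

With the lazy version, both input lines should yield the same four groups, which is what the question asks for.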
Full pattern details
\[ - a [ char
(error|warn) - Group 1: one of the two substrings (error or warn)
\] - (or just ]) a single ] char
\s+ - 1+ whitespaces
(.+?) - Group 2: any 1+ chars other than line break chars, as few as possible, up to the first...
: - a colon
(\d+) - Group 3: one or more digits
: - a colon
(?:\d+:)? - a non-capturing group matching 1+ digits and a colon after them, 1 or 0 times
\s+ - 1+ whitespaces
(.+) - Group 4: the rest of the line
$ - end of string (note that it is not necessary here since .+ is a greedy subpattern)
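As a rough illustration of how the four capturing groups map onto fields, here is a minimal Python sketch. CompilerMessage and parse_message are hypothetical names invented for this example, not part of any library, and Python is just one possible host language.

```python
import re
from typing import NamedTuple, Optional


class CompilerMessage(NamedTuple):
    """Hypothetical container for the four captured groups."""
    level: str   # Group 1: "error" or "warn"
    path: str    # Group 2: file path
    line: int    # Group 3: line number
    text: str    # Group 4: rest of the line


PATTERN = re.compile(r"\[(error|warn)\]\s+(.+?):(\d+):(?:\d+:)?\s+(.+)$")


def parse_message(line: str) -> Optional[CompilerMessage]:
    """Return the parsed message, or None if the line has a different shape."""
    m = PATTERN.match(line)
    if m is None:
        return None
    level, path, line_no, text = m.groups()
    return CompilerMessage(level, path, int(line_no), text)


with_column = parse_message(r"[error] C:\Me\MyPath\myFile.scala:18:22: not found: value getaa")
without_column = parse_message(r"[error] C:\Me\MyPath\myFile.scala:18: not found: value getaa")
print(with_column)
print(without_column)
```

The column number is deliberately not captured, since the question only asks for it to be skipped; turning (?:\d+:)? into (?:(\d+):)? would expose it as an extra, possibly empty, group.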