I think I misunderstand how a positive lookbehind works in regex. Here is an example:
12,2 g this is fully random
89 g random string 2
0,6 oz random stuff
1 really random stuff
Let's say I want to match everything after the measuring unit, so I want "this is fully random", "random string 2", "random stuff" and "really random stuff".
In order to do that I tried the following pattern:
(?<=(\d(,\d)?) (g|oz)?).*
But as "?" means 0 or 1, it seems that the pattern prioritizes 0 over 1 in that case, so the measuring unit is still included in the match.
But the measuring unit has to stay "optional", as it won't necessarily be in the string (cf. the fourth line)...
Any idea on how to deal with that issue? Thanks!
It is easier to see what happens by looking at the positions where the assertion is true. The assertion (?<=(\d(,\d)?) (g|oz)?) is true at any position where what is directly to the left is (\d(,\d)?), then a space, then an optional (g|oz)?
The regex engine works from left to right, and the assertion is true at multiple positions. At the first such position it encounters, which is right after the space that follows the number, .* (any character, 0 or more times) matches until the end of the line, so the unit ends up inside the match.
See the positions on regex101
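To make the positions concrete, here is a rough Python sketch. Python's re module only accepts fixed-width lookbehinds, so the original assertion will not compile there as written. As an assumption for illustration, the sketch expands it into three fixed-width alternatives (unit absent, "g", "oz") that should be true at the same positions. The helper names assertion_positions and first_match are made up for this example.

```python
import re

lines = [
    "12,2 g this is fully random",
    "89 g random string 2",
    "0,6 oz random stuff",
    "1 really random stuff",
]

# Fixed-width stand-ins for (?<=(\d(,\d)?) (g|oz)?), one per case:
# unit absent, unit "g", unit "oz". Text ending in "\d,\d " also ends
# in "\d ", so the optional ",\d" part needs no separate alternative.
ASSERTION = r"(?:(?<=\d )|(?<=\d g)|(?<=\d oz))"


def assertion_positions(text):
    """Return every index at which the assertion is true."""
    return [m.start() for m in re.finditer(ASSERTION, text)]


def first_match(text):
    """Return what assertion + .* matches, starting at the first true position."""
    m = re.search(ASSERTION + r".*", text)
    return m.group(0) if m else None


for line in lines:
    print(assertion_positions(line), repr(first_match(line)))
```

For the first line the assertion holds both before and after the "g", and because the engine takes the first position, the match starts at the unit.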
What you might do instead is match the digit part, make the space followed by g or oz optional, and use a capturing group for the second part:
\d+(?:,\d+)?(?: g| oz)? (.*)
Regex demo
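As a quick check of the capturing-group pattern against the four sample lines, here is a minimal Python sketch. The question does not say which language is used, so Python and the helper name text_after_unit are assumptions for illustration only.

```python
import re

# Number, optional " g" or " oz", one space, then the text in group 1.
PATTERN = re.compile(r"\d+(?:,\d+)?(?: g| oz)? (.*)")


def text_after_unit(line):
    """Return the text after the number and optional unit, or None if no match."""
    m = PATTERN.search(line)
    return m.group(1) if m else None


samples = [
    "12,2 g this is fully random",
    "89 g random string 2",
    "0,6 oz random stuff",
    "1 really random stuff",
]

for sample in samples:
    print(repr(text_after_unit(sample)))
```

The wanted text is read from group 1 rather than from the whole match, which is why no lookbehind is needed.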