
regex optional lookahead

Tags: regex

I want a regular expression to match all of these:

  1. startabcend
  2. startdef
  3. blahstartghiend
  4. blahstartjklendsomething

and to return abc, def, ghi and jkl respectively.

I have the following, which works for cases 1 and 3, but I am having trouble making the lookahead optional.

(?<=start).*(?=end.*)

Edit:

Hmm. Bad example. In reality, the bit in the middle is not numeric, but is preceded by a certain set of characters and optionally followed by one as well. I have updated the inputs and outputs as requested and added a 4th example in response to someone's question.

Paul Hiles asked Sep 09 '11 11:09


3 Answers

If you're able to use lookahead,

(?<=start).*?(?=(?:end|$))

as suggested by stema below is probably the simplest way to get the entire pattern to match what you want.

Alternatively, if you're able to use capturing groups, you should just do that instead:

start(.*?)(?:end|$)

and then just get the value from the first capture group.
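A minimal Python sketch of the capture-group approach, assuming the terminator is written as (?:end|$) so that trailing text after "end" (as in case 4) is also handled. The helper name middle is hypothetical and only used for illustration.

```python
import re

# "start", then a lazy capture that stops at the first "end" or,
# failing that, at the end of the string.
CAPTURE = re.compile(r"start(.*?)(?:end|$)")

def middle(text):
    """Return the text captured after "start", or None if "start" is absent."""
    m = CAPTURE.search(text)
    return m.group(1) if m else None

inputs = ["startabcend", "startdef", "blahstartghiend", "blahstartjklendsomething"]
captured = [middle(s) for s in inputs]
print(captured)  # ['abc', 'def', 'ghi', 'jkl']
```

The capture group keeps the pattern usable in engines without lookbehind support, at the cost of reading group 1 instead of the whole match.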

Amber answered Nov 02 '22 00:11


Maybe like this:

(?<=start).*?(?=(?:end|$))

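This will match everything between "start" and "end", or from "start" to the end of the line if there is no "end". Additionally, the quantifier has to be non-greedy (.*?), otherwise it would run past "end" to the end of the line.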

See it here on Regexr

Extended the example on Regexr to not only work with digits.
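For readers without access to the Regexr link, the lookaround pattern above can be checked with a short Python sketch. The helper name extract is hypothetical; the inputs are the four cases from the question.

```python
import re

# The lookbehind anchors the match right after "start"; the lazy .*?
# stops at the first position that is followed by "end" or is the
# end of the string.
PATTERN = re.compile(r"(?<=start).*?(?=(?:end|$))")

def extract(text):
    """Return the whole match, or None if "start" does not occur."""
    m = PATTERN.search(text)
    return m.group(0) if m else None

inputs = ["startabcend", "startdef", "blahstartghiend", "blahstartjklendsomething"]
results = [extract(s) for s in inputs]
print(results)  # ['abc', 'def', 'ghi', 'jkl']
```

For multi-line input, $ only matches at the end of each line when re.MULTILINE is passed.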

stema answered Nov 02 '22 00:11


An optional lookahead doesn't make sense:

If it's optional then it's ok if it matches, but it's also ok if it doesn't match. And since a lookahead does not extend the match it has absolutely no effect.

So the syntax for an optional lookahead is the empty string.
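This point can be illustrated with a small Python sketch. It assumes an "optional lookahead" is spelled as an alternation between the lookahead and the empty string, (?:(?=end)|), and compares it with the same pattern without any lookahead.

```python
import re

text = "startabcend"

# "Optional lookahead": either "end" follows, or nothing is required.
with_optional = re.search(r"(?<=start).*(?:(?=end)|)", text).group(0)

# The same pattern with the optional lookahead removed.
without = re.search(r"(?<=start).*", text).group(0)

print(with_optional)  # abcend
print(without)        # abcend
```

Both searches return the same text, because the empty alternative always succeeds and the lookahead consumes nothing. This is why the working patterns use a required lookahead with two alternatives, (?=end|$), rather than an optional one.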

Joachim Sauer answered Nov 02 '22 01:11