
What would be a regex to replace/remove END where it has not been preceded by an unclosed START?

What would be a regex (PHP, for use with preg_replace()) to replace/remove an END where it has not been preceded by an unclosed START?

Here are a few examples to show what I mean:

Example 1:

Input:

sometext....END

Output:

sometext.... //because there's no START, so the excess END is not needed

Example 2:

Input:

STARTsometext....END

Output:

STARTsometext....END //because it's preceded by a START

Example 3:

Input:

STARTsometext....END.......END

Output:

STARTsometext....END....... //because the second END is not preceded by an unclosed START

Hoping someone can help.

Thank You.

Newbtophp asked Dec 28 '22 04:12


1 Answer

Assuming you aren't looking for nested pairs, there is a simple solution to remove excess ENDs. Consider:

$str = preg_replace('/END|(START.*?END)/', '$1', $str);

This replacement looks a little backwards, but it makes sense once you understand the order in which the engine works. The regex is made of two main parts: END|(). The alternatives are tried from left to right, so if the engine sees an END at the current position in the input string, it will match it and move on to look for the next match.

The second part is a capturing group, which contains START.*?END. This matches an entire START/END pair if possible, from a START up to the first END that follows it. Everything else is skipped until the engine finds another END or START.

Since we use $1 in the replacement, which is the captured group, we only keep what the second alternative matched. When a lone END matches, the group is empty, so that END is replaced with nothing. Therefore, the only way for an END to survive is to get into the capturing group by being the first one after a START.
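To check the explanation against the three examples from the question, here is a minimal sketch. The function name removeExcessEnd is made up for illustration; the pattern and replacement are the ones shown above.

```php
<?php
// Hypothetical helper name; it only wraps the preg_replace() call from the answer.
function removeExcessEnd(string $str): string
{
    // Alternative 1 matches a lone END; group 1 is unset, so '$1' is empty
    // and the END is dropped.
    // Alternative 2 captures a whole START...END pair, which '$1' puts back.
    return preg_replace('/END|(START.*?END)/', '$1', $str);
}

echo removeExcessEnd('sometext....END'), "\n";                // sometext....
echo removeExcessEnd('STARTsometext....END'), "\n";           // STARTsometext....END
echo removeExcessEnd('STARTsometext....END.......END'), "\n"; // STARTsometext....END.......
```

The three results agree with the expected outputs listed in the question. As noted at the top of the answer, this does not handle nested START/END pairs.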

For example, take the text END START 123 END abc END. The regex walks through it as follows, removing, keeping, or skipping each piece accordingly:

  • END - Removed
  • (START 123 END) - Captured
  • a - Skip
  • b - Skip
  • c - Skip
  • END - Removed
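The walkthrough above can be reproduced with preg_match_all(), which lists only the actual matches (the skipped characters never match, so they do not appear). This is a sketch for inspection only; the variable names are arbitrary.

```php
<?php
$text = 'END START 123 END abc END';

// PREG_SET_ORDER gives one array per match: [0] is the full match,
// [1] is the capturing group when the START...END alternative matched.
preg_match_all('/END|(START.*?END)/', $text, $matches, PREG_SET_ORDER);

$trace = [];
foreach ($matches as $m) {
    // Group 1 is missing (or empty) when only the lone END alternative matched.
    $kept = isset($m[1]) && $m[1] !== '';
    $trace[] = $m[0] . ' - ' . ($kept ? 'Captured' : 'Removed');
}

echo implode("\n", $trace), "\n";
// END - Removed
// START 123 END - Captured
// END - Removed
```

Anything not listed in the trace (the spaces and the letters a, b, c) is left untouched by preg_replace(), because it is never part of a match.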

Working example: http://ideone.com/suVYh

Kobi answered Jan 12 '23 00:01