I have the following data, where {n} represents a placeholder.
{n}{n}A{n}{n}A{n}
{n}A{n}{n}{n}{n}A
{n}{n}A{n}A{n}{n}
{n}{n}{n}A{n}A{n}B
{n}A{n}{n}B{n}{n}
A{n}B{n}{n}{n}{n}
I would like to replace each instance of the placeholder between two A characters with, for example, the letter C. I wrote the following regex for this, and I'm using the preg_replace function.
$str = preg_replace('~(?<=A)(\{n\})*(?=A)~', 'C', $str);
The problem is that it replaces all the placeholders between two A's with a single C. How can I fix my regex or the preg_replace call so that each individual placeholder is replaced with C?
This should be my output.
{n}{n}ACCA{n}
{n}ACCCCA
{n}{n}ACA{n}{n}
{n}{n}{n}ACA{n}B
{n}A{n}{n}B{n}{n}
A{n}B{n}{n}{n}{n}
But currently it outputs this.
{n}{n}ACA{n}
{n}ACA
{n}{n}ACA{n}{n}
{n}{n}{n}ACA{n}B
{n}A{n}{n}B{n}{n}
A{n}B{n}{n}{n}{n}
You can solve the problem by anchoring with \G.
$str = preg_replace('~(?:\G(?!\A)|({n})*A(?=(?1)++A))\K{n}~', 'C', $str);
The \G feature is an anchor that can match at one of two positions: the start of the string, or the position at the end of the previous match. The \K escape sequence resets the starting point of the reported match, so any previously consumed characters are no longer included in it.
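To see the two features in isolation, here is a minimal sketch on made-up strings (the inputs `'id=42 id=7'` and `'aaabaa'` are illustrative only and are not from the question):

```php
<?php
// \K: everything matched before it is dropped from the reported match,
// so only the digits after "id=" are replaced.
$kept = preg_replace('~id=\K\d+~', 'X', 'id=42 id=7');
echo $kept, "\n";    // id=X id=X

// \G: each match has to begin exactly where the previous one ended,
// so only the leading run of "a" characters is replaced.
$chained = preg_replace('~\Ga~', '-', 'aaabaa');
echo $chained, "\n"; // ---baa
```

In the pattern above, the two are combined: the second alternative finds the opening A and uses \K to drop it from the match, and \G lets each following {n} chain on from the previous replacement.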
To reduce the amount of backtracking, you could use a more complex expression:
$str = preg_replace('~\G(?!\A)(?:{n}
|A(?:[^A]*A)+?((?=(?:{n})++A)\K{n}
|(*COMMIT)(*F)))
|[^A]*A(?:[^A]*A)*?(?1)~x', 'C', $str);
A somewhat more verbose but easier-to-follow solution is to use the initial expression to split the text into groups, then apply the replacement inside each group:
$text = preg_replace_callback('~(?<=A)(?:\{n\})*(?=A)~', function ($match) {
    // replace every placeholder inside the matched run
    return str_replace('{n}', 'C', $match[0]);
}, $text);
I've made a small tweak to the expression to get rid of the capturing group, which is unnecessary, by using (?:...).
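As a quick check, the callback approach can be run against the sample data from the question. This is only a sketch; the variable names `$text`, `$expected` and `$result` are chosen here for illustration:

```php
<?php
// Input lines from the question, joined into one string.
$text = implode("\n", [
    '{n}{n}A{n}{n}A{n}',
    '{n}A{n}{n}{n}{n}A',
    '{n}{n}A{n}A{n}{n}',
    '{n}{n}{n}A{n}A{n}B',
    '{n}A{n}{n}B{n}{n}',
    'A{n}B{n}{n}{n}{n}',
]);

// Desired output from the question.
$expected = implode("\n", [
    '{n}{n}ACCA{n}',
    '{n}ACCCCA',
    '{n}{n}ACA{n}{n}',
    '{n}{n}{n}ACA{n}B',
    '{n}A{n}{n}B{n}{n}',
    'A{n}B{n}{n}{n}{n}',
]);

// Match each whole run of placeholders between two A's,
// then replace the placeholders inside that run one by one.
$result = preg_replace_callback('~(?<=A)(?:\{n\})*(?=A)~', function ($match) {
    return str_replace('{n}', 'C', $match[0]);
}, $text);

var_dump($result === $expected); // bool(true)
```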
(?<=A){n}(?=(?:{n})*A)|\G(?!^){n}
You can try this and replace with C. Here you have to use \G to assert the position at the end of the previous match (or at the start of the string for the first match), so that matching can continue right after the previous match. See the demo:
https://regex101.com/r/wU4xK1/7
Here you first match a {n} that has an A behind it and an A after it, with any number of {n} allowed in between. After that first match, \G anchors at the end of the previous match, so each following {n} keeps being replaced.
$re = "/(?<=A){n}(?=(?:{n})*A)|\\G(?!^){n}/";
$str = "{n}{n}A{n}{n}A{n}\n{n}A{n}{n}{n}{n}A\n{n}{n}A{n}A{n}{n}\n{n}{n}{n}A{n}A{n}B\n{n}A{n}{n}B{n}{n}\nA{n}B{n}{n}{n}{n}";
$subst = "C";
$result = preg_replace($re, $subst, $str);
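To see which placeholders this pattern picks up, the matches can be listed with their offsets. This is a small sketch on a single made-up line (`$line` is illustrative, not taken from the question):

```php
<?php
$re   = '/(?<=A){n}(?=(?:{n})*A)|\G(?!^){n}/';
$line = 'A{n}{n}A{n}';

// PREG_OFFSET_CAPTURE returns each match as [text, byte offset].
preg_match_all($re, $line, $matches, PREG_OFFSET_CAPTURE);
$offsets = array_column($matches[0], 1);

print_r($offsets);                         // offsets 1 and 4
echo preg_replace($re, 'C', $line), "\n";  // ACCA{n}
```

The placeholder at offset 1 is found by the lookaround branch, the one at offset 4 by the \G branch, and the trailing placeholder at offset 8 is left alone because no A follows it.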