Replace each instance between two characters

Question

I have the following data, where {n} represents a placeholder.

{n}{n}A{n}{n}A{n}
{n}A{n}{n}{n}{n}A
{n}{n}A{n}A{n}{n}
{n}{n}{n}A{n}A{n}B
{n}A{n}{n}B{n}{n}
A{n}B{n}{n}{n}{n}

I would like to replace each instance of the placeholder between two A characters with, for example, the letter C. I wrote the following regex for this, and I'm using the preg_replace function.

$str = preg_replace('~(?<=A)(\{n\})*(?=A)~', 'C', $str);

The problem is that it replaces all instances between two A's with one C. How could I fix my regex or the preg_replace call to replace each individual instance of the placeholders with C?

This should be my output.

{n}{n}ACCA{n}
{n}ACCCCA
{n}{n}ACA{n}{n}
{n}{n}{n}ACA{n}B
{n}A{n}{n}B{n}{n}
A{n}B{n}{n}{n}{n}

But currently it outputs this.

{n}{n}ACA{n}
{n}ACA
{n}{n}ACA{n}{n}
{n}{n}{n}ACA{n}B
{n}A{n}{n}B{n}{n}
A{n}B{n}{n}{n}{n}

hwnd · Accepted Answer

You can solve the problem by anchoring with \G.

$str = preg_replace('~(?:\G(?!\A)|({n})*A(?=(?1)++A))\K{n}~', 'C', $str);

\G is an anchor that can match at one of two positions: the start of the string, or the position where the previous match ended. The (?!\A) lookahead rules out the start-of-string case, so that branch can only continue a run that an earlier match began. The \K escape sequence resets the starting point of the reported match, so any characters consumed before it are not included in the match and are not replaced.
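To see the two features in isolation, here is a minimal sketch. The subject strings 'foobar foo bar' and '123a45' and the variable names are made up for illustration; the last call uses the pattern from this answer with the braces escaped (an optional change for readability) on the first sample line from the question.

```php
<?php
// \K: "foo" must match, but it is dropped from the reported match,
// so only "bar" is replaced. "foo bar" (with a space) does not match at all.
$kDemo = preg_replace('~foo\Kbar~', 'X', 'foobar foo bar');

// \G: every match has to begin exactly where the previous one ended,
// so matching stops at the first non-digit and never reaches "45".
preg_match_all('~\G\d~', '123a45', $m);
$gDemo = $m[0];

// Both together: the second branch finds the first {n} after an A whose
// run of {n} is closed by another A; the \G branch then continues
// through the rest of that run, one {n} per match.
$combined = preg_replace(
    '~(?:\G(?!\A)|(\{n\})*A(?=(?1)++A))\K\{n\}~',
    'C',
    '{n}{n}A{n}{n}A{n}'
);

var_dump($kDemo, $gDemo, $combined);
```

The third result should be {n}{n}ACCA{n}, which is the first line of the desired output in the question.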

To reduce the amount of backtracking, you could use a more complex expression:

$str = preg_replace('~\G(?!\A)(?:{n}
                      |A(?:[^A]*A)+?((?=(?:{n})++A)\K{n}
                      |(*COMMIT)(*F)))
                      |[^A]*A(?:[^A]*A)*?(?1)~x', 'C', $str);

Ja͢ck · Answer

The somewhat more verbose but easier to follow solution is to use the initial expression to break the text up into groups; then apply the individual transformation inside each group:

$text = preg_replace_callback('~(?<=A)(?:\{n\})*(?=A)~', function($match) {
    // simple replacement inside
    return str_replace('{n}', 'C', $match[0]);
}, $text);

I've made a small tweak to the expression: the capturing group is unnecessary, so I replaced it with a non-capturing group, (?:...).
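As a quick check, the callback approach can be run against the six sample lines from the question. This is only a sketch: the $lines and $result names are illustrative, and it passes the lines as an array (which preg_replace_callback accepts) rather than as one multi-line string.

```php
<?php
// Sample lines copied from the question.
$lines = [
    '{n}{n}A{n}{n}A{n}',
    '{n}A{n}{n}{n}{n}A',
    '{n}{n}A{n}A{n}{n}',
    '{n}{n}{n}A{n}A{n}B',
    '{n}A{n}{n}B{n}{n}',
    'A{n}B{n}{n}{n}{n}',
];

// The pattern grabs the whole run of placeholders between two A's;
// the callback then swaps every {n} inside that run for C.
$result = preg_replace_callback(
    '~(?<=A)(?:\{n\})*(?=A)~',
    function (array $match): string {
        return str_replace('{n}', 'C', $match[0]);
    },
    $lines
);

echo implode("\n", $result), "\n";
```

One side effect of doing the replacement inside the callback: an empty run, such as the one between the letters in "AA", is returned unchanged, whereas replacing the whole match with a fixed 'C' would insert a C there.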

vks · Answer

(?<=A){n}(?=(?:{n})*A)|\G(?!^){n}

Try this pattern and replace each match with C. \G asserts the position at the end of the previous match (or at the start of the string on the first attempt, which (?!^) excludes).

That lets each subsequent match continue directly after the previous one. See the demo:

https://regex101.com/r/wU4xK1/7

The first alternative matches a {n} that has an A immediately before it and is followed by zero or more {n} and then an A. After that match, the second alternative uses \G to start at the end of the previous match and keeps replacing each {n} it finds there.

$re = '/(?<=A){n}(?=(?:{n})*A)|\G(?!^){n}/';
$str = "{n}{n}A{n}{n}A{n}
{n}A{n}{n}{n}{n}A
{n}{n}A{n}A{n}{n}
{n}{n}{n}A{n}A{n}B
{n}A{n}{n}B{n}{n}
A{n}B{n}{n}{n}{n}";
$subst = "C";

$result = preg_replace($re, $subst, $str);

Tags: regex, php, preg-replace

RMartin
