
Replace each instance between two characters

I have the following data, where {n} represents a placeholder.

{n}{n}A{n}{n}A{n}
{n}A{n}{n}{n}{n}A
{n}{n}A{n}A{n}{n}
{n}{n}{n}A{n}A{n}B
{n}A{n}{n}B{n}{n}
A{n}B{n}{n}{n}{n}

I would like to replace each instance of the placeholder between two A characters with, for example, the letter C. I wrote the following regex for this, and I'm using the preg_replace function.

$str = preg_replace('~(?<=A)(\{n\})*(?=A)~', 'C', $str);

The problem is that it replaces all instances between two A's with one C. How could I fix my regex or the preg_replace call to replace each individual instance of the placeholders with C?

This should be my output.

{n}{n}ACCA{n}
{n}ACCCCA
{n}{n}ACA{n}{n}
{n}{n}{n}ACA{n}B
{n}A{n}{n}B{n}{n}
A{n}B{n}{n}{n}{n}

But currently it outputs this.

{n}{n}ACA{n}
{n}ACA
{n}{n}ACA{n}{n}
{n}{n}{n}ACA{n}B
{n}A{n}{n}B{n}{n}
A{n}B{n}{n}{n}{n}
asked Feb 13 '15 03:02 by RMartin



3 Answers

You can solve the problem by anchoring with \G.

$str = preg_replace('~(?:\G(?!\A)|({n})*A(?=(?1)++A))\K{n}~', 'C', $str);

\G is an anchor that can match at one of two positions: the start of the string, or the position where the previous match ended. Here (?!\A) rules out the start of the string, so \G only continues from a previous match. The \K escape sequence resets the starting point of the reported match, so any characters consumed before it are not included in the match.
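
As a quick check, the sketch below runs the first pattern over the six sample lines from the question and compares the result with the expected output listed there. The variable names ($lines, $expected, $result) are only illustrative, and the braces are escaped here as in the question's own pattern, which should be equivalent to the unescaped form.

```php
<?php
// Sample lines from the question.
$lines = [
    '{n}{n}A{n}{n}A{n}',
    '{n}A{n}{n}{n}{n}A',
    '{n}{n}A{n}A{n}{n}',
    '{n}{n}{n}A{n}A{n}B',
    '{n}A{n}{n}B{n}{n}',
    'A{n}B{n}{n}{n}{n}',
];

// Expected output listed in the question.
$expected = [
    '{n}{n}ACCA{n}',
    '{n}ACCCCA',
    '{n}{n}ACA{n}{n}',
    '{n}{n}{n}ACA{n}B',
    '{n}A{n}{n}B{n}{n}',
    'A{n}B{n}{n}{n}{n}',
];

// \G(?!\A) continues from the end of the previous match; the other branch
// finds the first {n} after an A, provided the run of {n} ends at another A.
$pattern = '~(?:\G(?!\A)|(\{n\})*A(?=(?1)++A))\K\{n\}~';

// preg_replace accepts an array of subjects and returns an array.
$result = preg_replace($pattern, 'C', $lines);

var_dump($result === $expected); // bool(true)
```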

To reduce the amount of backtracking, you could use a more complex expression:

$str = preg_replace('~\G(?!\A)(?:{n}
                      |A(?:[^A]*A)+?((?=(?:{n})++A)\K{n}
                      |(*COMMIT)(*F)))
                      |[^A]*A(?:[^A]*A)*?(?1)~x', 'C', $str);
answered Oct 09 '22 14:10 by hwnd



A somewhat more verbose but easier-to-follow solution is to use the original expression to isolate each run of placeholders between two A's, then apply the replacement inside each run:

$text = preg_replace_callback('~(?<=A)(?:\{n\})*(?=A)~', function($match) {
    // simple replacement inside
    return str_replace('{n}', 'C', $match[0]);
}, $text);

I've made a small tweak to the expression: the capturing group is unnecessary, so I replaced it with a non-capturing group, (?:...).
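
To illustrate the difference, here is a minimal sketch that applies the same pattern to one sample line, once with plain preg_replace and once with the callback. The variable names ($line, $collapsed, $perPlaceholder) are made up for the example.

```php
<?php
$pattern = '~(?<=A)(?:\{n\})*(?=A)~';
$line = '{n}A{n}{n}{n}{n}A';

// Plain replacement: the whole run between the two A's is a single match,
// so it collapses to a single C.
$collapsed = preg_replace($pattern, 'C', $line);

// Callback: the run is still a single match, but every {n} inside it
// is replaced individually.
$perPlaceholder = preg_replace_callback($pattern, function ($match) {
    return str_replace('{n}', 'C', $match[0]);
}, $line);

echo $collapsed, "\n";      // {n}ACA
echo $perPlaceholder, "\n"; // {n}ACCCCA
```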

answered Oct 09 '22 14:10 by Ja͢ck



(?<=A){n}(?=(?:{n})*A)|\G(?!^){n}

You can try this, replacing each match with C. \G asserts the position at the end of the previous match (or the start of the string for the first match), which lets the pattern keep matching right after the first match. See the demo.

https://regex101.com/r/wU4xK1/7

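The first alternative matches a {n} that has an A directly behind it and an A somewhere after it, with only further {n} placeholders in between. After that first match, \G anchors each new attempt at the end of the previous match, so the second alternative keeps replacing the {n} placeholders that follow. The (?!^) prevents \G from matching at the very start of the string.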

$re = "/(?<=A){n}(?=(?:{n})*A)|\\G(?!^){n}/";
$str = "{n}{n}A{n}{n}A{n}\n{n}A{n}{n}{n}{n}A\n{n}{n}A{n}A{n}{n}\n{n}{n}{n}A{n}A{n}B\n{n}A{n}{n}B{n}{n}\nA{n}B{n}{n}{n}{n}";
$subst = "C";

$result = preg_replace($re, $subst, $str);
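
To see the two alternatives at work, the following sketch lists the byte offset of every match on a single sample line and labels which alternative produced it. It is only an illustration: the labelling relies on the observation that a match directly after an A can only come from the lookbehind alternative, and the names $line, $offsets and $branch are hypothetical.

```php
<?php
$re = '/(?<=A)\{n\}(?=(?:\{n\})*A)|\G(?!^)\{n\}/';
$line = '{n}A{n}{n}{n}{n}A';

preg_match_all($re, $line, $matches, PREG_OFFSET_CAPTURE);

// Each entry of $matches[0] is [matched text, byte offset].
$offsets = array_column($matches[0], 1);

foreach ($offsets as $offset) {
    // Directly after an A: lookbehind alternative.
    // Otherwise: continuation from the previous match via \G.
    $branch = ($line[$offset - 1] === 'A') ? 'lookbehind' : '\G';
    echo $offset, ' => ', $branch, "\n";
}
```

The leading {n} at offset 0 is never matched, because it has no A behind it and \G is not allowed to match at the start of the string.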
answered Oct 09 '22 12:10 by vks
