
Consolidate repeating pattern

Tags: regex, php

I am working on a script that generates strings of alphanumeric characters separated by dashes (-). I need to test each string to see whether any sets of characters (the characters that lie between the dashes) are the same. If they are, I need to consolidate them into one. In my case, the repeated sets always occur at the front.

Examples:

KRS-KRS-454-L
would become:
KRS-454-L

DERP-DERP-545-P
would become:
DERP-545-P
kylex asked Jul 18 '11 14:07


2 Answers

<?php
$s = 'KRS-KRS-454-L';
echo preg_replace('/^(\w+)-(?=\1)/', '', $s);
// KRS-454-L
?>

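This uses a positive lookahead (?=...) to check that the first set is immediately followed by a copy of itself. The lookahead does not consume anything, so only the first set and its dash are matched and removed.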

Note that \w also matches the underscore. If you want to limit the match to alphanumeric characters only, use [a-zA-Z0-9] instead.
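To make the difference concrete, here is a small sketch comparing the two character classes on an input that contains underscores. The variable names and the test string are my own, chosen only for illustration.

```php
<?php
// Sketch only: variable names and the sample string are illustrative.
$withWord  = '/^(\w+)-(?=\1)/';           // \w = letters, digits and underscore
$alnumOnly = '/^([a-zA-Z0-9]+)-(?=\1)/';  // letters and digits only

$s = 'A_B-A_B-1';
echo preg_replace($withWord, '', $s), "\n";   // A_B-1
echo preg_replace($alnumOnly, '', $s), "\n";  // A_B-A_B-1 (no match, unchanged)
```

With the stricter class, the first set can only be "A", which is not followed by a dash, so nothing is replaced.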

Also, I've anchored with ^ as you've mentioned: "The repeating chars would always occur at the front [...]"
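Two edge cases may be worth checking, depending on your data. As far as I can tell, the pattern above removes only one leading copy (so DERP-DERP-DERP-545-P becomes DERP-DERP-545-P), and the lookahead accepts a prefix of the next set (so KR-KRS-454-L becomes KRS-454-L). Below is a possible variation, not the pattern from this answer: it matches the repeated copies directly instead of using a lookahead on the back-reference, and requires the last copy to end at a dash or at the end of the string. The function name is hypothetical, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper; the name and the pattern are assumptions for illustration.
function collapseLeadingRepeats(string $s): string
{
    // ^([a-zA-Z0-9]+)  first set, anchored at the start
    // (?:-\1)+         one or more further copies of that set
    // (?=-|$)          the last copy must end at a dash or at the end of the string
    return preg_replace('/^([a-zA-Z0-9]+)(?:-\1)+(?=-|$)/', '$1', $s);
}

$tests = ['KRS-KRS-454-L', 'DERP-DERP-DERP-545-P', 'KR-KRS-454-L', 'OKAY-666-A'];
foreach ($tests as $t) {
    echo collapseLeadingRepeats($t), "\n";
}
// KRS-454-L
// DERP-545-P
// KR-KRS-454-L
// OKAY-666-A
```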

mhyfritz answered Nov 19 '22 14:11


Try the pattern:

/([a-z]+)(?:-\1)*(.*)/i

and replace it with:

$1$2

A demo:

$tests = array(
  'KRS-KRS-454-L',
  'DERP-DERP-DERP-545-P',
  'OKAY-666-A'
);

foreach ($tests as $t) {
  echo preg_replace('/([a-z]+)(?:-\1)*(.*)/i', '$1$2', $t) . "\n";
}

produces:

KRS-454-L
DERP-545-P
OKAY-666-A

A quick explanation:

([a-z]+)  # group the first "word" in match group 1
(?:-\1)*  # match a hyphen followed by what was matched in 
          # group 1, and repeat it zero or more times
(.*)      # match the rest of the input and store it in group 2

The replacement string $1$2 is replaced by what was matched by groups 1 and 2 in the pattern above, so the repeated copies matched in between are dropped.
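If it helps to see the same three steps without a regex, here is a rough procedural sketch using explode and implode. This is a different technique from the pattern above, the function name is made up for illustration, and it compares whole sets, so it also handles first sets that contain digits. The type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper, not part of the answer: the same three steps without a regex.
function consolidateLeading(string $s): string
{
    $parts = explode('-', $s);  // split the input into its sets
    $first = $parts[0];         // step 1: the first "word"
    $i = 1;
    // step 2: skip every following set that is identical to the first one
    while ($i < count($parts) && $parts[$i] === $first) {
        $i++;
    }
    // step 3: keep the first word plus the rest of the input
    return implode('-', array_merge([$first], array_slice($parts, $i)));
}

foreach (['KRS-KRS-454-L', 'DERP-DERP-DERP-545-P', 'OKAY-666-A'] as $t) {
    echo consolidateLeading($t), "\n";
}
// KRS-454-L
// DERP-545-P
// OKAY-666-A
```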

Bart Kiers answered Nov 19 '22 14:11