I am working on a script that generates strings of alphanumeric characters, separated by dashes (-). I need to test each string to see whether any sets of characters (the characters that lie between the dashes) are the same. If they are, I need to consolidate them. The repeating chars would always occur at the front in my case.
Examples:
KRS-KRS-454-L
would become:
KRS-454-L
DERP-DERP-545-P
would become:
DERP-545-P
<?php
$s = 'KRS-KRS-454-L';
echo preg_replace('/^(\w+)-(?=\1)/', '', $s);
// KRS-454-L
?>
This uses a positive lookahead, (?=...), to check for repeated strings.
Note that \w also matches the underscore. If you want to limit the match to alphanumeric characters only, use [a-zA-Z0-9].
Also, I've anchored the pattern with ^, as you've mentioned: "The repeating chars would always occur at the front [...]"
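To make the two notes above concrete, here is a minimal sketch of a stricter variant. It is an assumption layered on the pattern above, not part of the original answer: it swaps \w for [a-zA-Z0-9] and adds (?:-|$) inside the lookahead, so the repeated text has to be a whole set rather than just the start of a longer one.

```php
<?php
// Sketch only: variant of the lookahead pattern with two assumed changes.
//  - [a-zA-Z0-9] instead of \w, so underscores are not part of a set
//  - (?:-|$) inside the lookahead, so the repeat must be a whole set
$pattern = '/^([a-zA-Z0-9]+)-(?=\1(?:-|$))/';

$tests = array(
    'KRS-KRS-454-L',   // exact repeat at the front
    'KRS-KRSX-454-L',  // second set merely starts with the first one
    'OKAY-666-A',      // no repeat at all
);

$results = array();
foreach ($tests as $t) {
    // Remove the first set (and its dash) only when the next set is identical.
    $results[$t] = preg_replace($pattern, '', $t);
    echo $t . ' => ' . $results[$t] . "\n";
}
```

With the unmodified pattern, KRS-KRSX-454-L would lose its first set, because the lookahead only checks that the next set begins with KRS. Like the original, this variant removes a single repeat per call, so DERP-DERP-DERP-545-P would come out as DERP-DERP-545-P.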
Try the pattern:
/([a-z]+)(?:-\1)*(.*)/i
and replace it with:
$1$2
A demo:
$tests = array(
    'KRS-KRS-454-L',
    'DERP-DERP-DERP-545-P',
    'OKAY-666-A'
);
foreach ($tests as $t) {
    echo preg_replace('/([a-z]+)(?:-\1)*(.*)/i', '$1$2', $t) . "\n";
}
produces:
KRS-454-L
DERP-545-P
OKAY-666-A
A quick explanation:
([a-z]+)   # group the first "word" in match group 1
(?:-\1)*   # match a hyphen followed by what was matched in
           # group 1, and repeat it zero or more times
(.*)       # match the rest of the input and store it in group 2
In the replacement string, $1 and $2 are replaced by what was matched by group 1 and group 2 in the pattern above.
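For comparison, the same consolidation can be sketched without a regular expression by splitting on the dash and skipping leading sets that equal the first one. This is a different technique from the pattern above, and the function name collapseLeadingRepeats is made up for illustration.

```php
<?php
// Hypothetical helper (not from the answers above): keeps the first
// dash-separated set and drops identical sets that directly follow it.
function collapseLeadingRepeats($s)
{
    $parts = explode('-', $s);
    $first = $parts[0];

    // Advance past every set that is identical to the first one.
    $i = 1;
    while ($i < count($parts) && $parts[$i] === $first) {
        $i++;
    }

    // Rebuild the string from the first set plus everything after the repeats.
    return implode('-', array_merge(array($first), array_slice($parts, $i)));
}

$tests = array(
    'KRS-KRS-454-L',
    'DERP-DERP-DERP-545-P',
    'OKAY-666-A',
    'KRS-KRSX-454-L',
);
foreach ($tests as $t) {
    echo collapseLeadingRepeats($t) . "\n";
}
```

Because whole sets are compared with ===, KRS-KRSX-454-L is left unchanged. The comparison is also case-sensitive, whereas the pattern above uses the i flag.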