
Remove repetitive consecutive words from string

I want to remove repetitive words from the string (only consecutive).

$str = 'abc,def,fgh,fgh,xna,fgh,xyz,xyz,xyz,tr,tr,xna';

My desired output string is:

abc,def,fgh,xna,fgh,xyz,tr,xna

I can get the result I want in PHP using this:

$ip = explode(',', $str);
$op = [];
$last = null;
foreach ($ip as $word) {
    // Strict comparison avoids surprises such as '1e1' == '10' or null == ''.
    if ($word === $last) {
        continue;
    }
    $op[] = $last = $word;
}
$ip = implode(',', $op);

But I was looking for a regex approach. So far I have gotten close with these two regexes:

$after = preg_replace('/(?:^|,)([^,]+)(?=.*,\1(?:,|$))/m', '', $str);
output : abc,def,fgh,xyz,tr,xna

$after = preg_replace('/([^,]+)(,[ ]*\1)+/m', '', $str);
output : abc,degh,fgh,xna,fgh,,,xna
asked Jul 21 '19 20:07 by shreyas d



3 Answers

You should use

preg_replace('~(?<![^,])([^,]+)(?:,\1)+(?![^,])~', '$1', $str)

See the regex demo

To allow optional whitespace between a comma and the repeated value, add \s* (zero or more whitespace characters) before \1.

Details

  • (?<![^,]) - the position must be at the start of the string or immediately after a comma
  • ([^,]+) - Group 1: one or more characters other than a comma
  • (?:,\1)+ - one or more repetitions of a comma followed by the same text as Group 1
  • (?![^,]) - the position must be at the end of the string or immediately before a comma
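
As a quick check of the pattern described above, the following sketch runs it against the question's string and against a value that is only a prefix of its neighbour. The helper name collapseRuns, its $allowSpaces flag, and the extra test strings are assumptions made for illustration; the second pattern is the \s* variant mentioned earlier.

```php
<?php
// Hypothetical helper (not from the original answer): collapses runs of
// identical consecutive comma-separated values using the pattern above.
function collapseRuns(string $str, bool $allowSpaces = false): string
{
    $pattern = $allowSpaces
        ? '~(?<![^,])([^,]+)(?:,\s*\1)+(?![^,])~'
        : '~(?<![^,])([^,]+)(?:,\1)+(?![^,])~';
    return preg_replace($pattern, '$1', $str);
}

echo collapseRuns('abc,def,fgh,fgh,xna,fgh,xyz,xyz,xyz,tr,tr,xna'), "\n";
// abc,def,fgh,xna,fgh,xyz,tr,xna

echo collapseRuns('abc,abcd'), "\n";
// abc,abcd  (no match: "abc" is only a prefix of "abcd")

echo collapseRuns('tr, tr,xna', true), "\n";
// tr,xna
```

The two lookarounds are what keep the match aligned to whole values, which is why the second call leaves its input unchanged.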
answered Oct 11 '22 05:10 by Wiktor Stribiżew


$after = preg_replace('/(?<=^|,)([^,]+)(,\s*\1)+(?=,|$)/', '$1', $str);

P.S. You can drop \s* from the regex above if no whitespace is expected after the comma. I saw your [ ]* and figured there might be some. The trailing (?=,|$) makes sure the repeated value ends at a comma or at the end of the string.
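
To see why a trailing boundary matters for a pattern of this shape, the sketch below compares a variant without a trailing lookahead against one that adds (?=,|$). The input abc,abcd and the variable names are made up for illustration.

```php
<?php
// Hypothetical input: the second value merely starts with the first.
$input = 'abc,abcd';

// Without a trailing boundary, "abc,abc" matches inside "abc,abcd".
$loose = preg_replace('/(?<=^|,)([^,]+)(,\s*\1)+/', '$1', $input);

// With (?=,|$), the repeated value must end at a comma or at the end of the string.
$strict = preg_replace('/(?<=^|,)([^,]+)(,\s*\1)+(?=,|$)/', '$1', $input);

echo $loose, "\n";  // abcd
echo $strict, "\n"; // abc,abcd
```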

answered Oct 11 '22 03:10 by 5lava


Iterate with strtok and append only the pieces that differ from the previous one:

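<?php

$str = 'abc,def,fgh,fgh,xna,fgh,xyz,xyz,xyz,tr,tr,xna';

// The first token is both the output so far and the last value seen.
$out = $last = strtok($str, ',');

// Compare against false explicitly so that a "0" token does not end the loop early.
while (($current = strtok(',')) !== false) {
    if ($current !== $last) {
        $out .= ',' . ($last = $current);
    }
}

echo $out;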

Output:

abc,def,fgh,xna,fgh,xyz,tr,xna
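
Two properties of strtok are worth keeping in mind with this approach: it skips empty fields, and a token such as "0" is falsy in PHP, so the loop condition is safer when it compares against false explicitly. The sketch below wraps the loop in a hypothetical function, dedupeWithStrtok, and exercises both cases; the function name and the sample inputs are assumptions.

```php
<?php
// Hypothetical wrapper around the strtok loop above.
function dedupeWithStrtok(string $str): string
{
    $out = $last = strtok($str, ',');
    if ($out === false) {
        return ''; // no tokens at all (empty string or only commas)
    }
    // Compare with false explicitly so a "0" token does not end the loop.
    while (($current = strtok(',')) !== false) {
        if ($current !== $last) {
            $out .= ',' . ($last = $current);
        }
    }
    return $out;
}

echo dedupeWithStrtok('a,0,0,b'), "\n"; // a,0,b
echo dedupeWithStrtok('a,,a,b'), "\n";  // a,b (the empty field is skipped, so the two "a" values become adjacent)
```

An explode-based loop would keep the empty field in the second input and return a,,a,b, so the two approaches only agree when the input has no empty values.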
answered Oct 11 '22 04:10 by Progrock