
JavaScript Regex Remove Specific Consecutive Duplicate Characters

I am trying to build a regex function that removes any non-alphanumeric characters and all duplicate characters, e.g. this: aabcd*def%gGGhhhijkklmnoP\1223 would become this: abcddefgGhijklmnoPR3. I can remove the special characters easily, but I can't work out how to remove the duplicate characters. This is my current code for removing the special characters:

var oldString = "aabcd*def%gGGhhhijkklmnoP\122"; // "\122" is an octal escape for "R" (non-strict code only)
var filtered = oldString.replace(/[^\w\s]/gi, "");

How can I extend the above regex to also remove duplicate characters, including duplicates that are separated by non-alphanumeric characters?

jonnyhitek asked Oct 15 '11 21:10


3 Answers

The regex is /[^\w\s]|(.)\1/gi

Test here: http://jsfiddle.net/Cte94/

It uses a backreference: (.) captures any character and \1 matches that same character again, so each adjacent pair is removed entirely (aa becomes an empty string).

If by "check for duplicate characters" you instead meant collapsing runs, so that aaa => a, then it's /[^\w\s]|(.)(?=\1)/gi

Test here: http://jsfiddle.net/Cte94/1/
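To make the difference between the two patterns concrete, here is a small sketch that runs both on a short sample string. The sample string and variable names are mine, not from the fiddles.

```javascript
// Illustrative sample; not taken from the linked fiddles.
const sample = "aab*ccc";

// (.)\1 consumes BOTH characters of a pair, so "aa" and "cc" vanish entirely
// and only the unpaired characters survive.
const pairsRemoved = sample.replace(/[^\w\s]|(.)\1/gi, "");

// (.)(?=\1) consumes only a character that is followed by a copy of itself,
// so every run collapses to its final character.
const runsCollapsed = sample.replace(/[^\w\s]|(.)(?=\1)/gi, "");

console.log(pairsRemoved);  // "bc"
console.log(runsCollapsed); // "abc"
```

In both cases the `*` is removed by the first alternative, `[^\w\s]`.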

Be aware that neither regex distinguishes case: A == a, so Aa counts as a repetition. If you don't want that, remove the i from /gi.
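A quick sketch of what the i flag changes, using the lookahead version of the pattern. The input string is just an illustration I picked.

```javascript
// Illustrative input mixing upper and lower case.
const mixedCase = "AaBbb";

// With the i flag the backreference also matches the other case,
// so "Aa" is treated as a run and collapses to its last character.
const ignoringCase = mixedCase.replace(/[^\w\s]|(.)(?=\1)/gi, "");

// Without the i flag only exact repeats ("bb") count as duplicates.
const respectingCase = mixedCase.replace(/[^\w\s]|(.)(?=\1)/g, "");

console.log(ignoringCase);   // "ab"
console.log(respectingCase); // "AaBb"
```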

xanatos answered Oct 20 '22 15:10


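\1+ is the key: it matches one or more repeats of the captured character, so the whole run is replaced by a single copy.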

"aabcdd".replace(/(\w)\1+/g, function (str, match) {
    return match; // match is the captured character (str is the whole matched run)
}); // abcd
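The question also asks about duplicates separated by non-alphanumeric characters. One possible way to handle that, sketched here under my own assumptions, is to strip the unwanted characters first and collapse runs second, so that d*d ends up as a single d. The function name cleanAndCollapse is hypothetical. Unlike the [^\w\s] pattern in the question, this one also drops whitespace and underscores.

```javascript
// Hypothetical helper: strip non-alphanumerics, then collapse repeated characters.
function cleanAndCollapse(input) {
  return input
    .replace(/[^a-z0-9]/gi, "")  // pass 1: keep only ASCII letters and digits
    .replace(/(.)\1+/g, "$1");   // pass 2: replace each run with one copy (case-sensitive)
}

console.log(cleanAndCollapse("aabcd*def%gGG")); // "abcdefgG"
console.log(cleanAndCollapse("a*a"));           // "a"
```

Doing the strip first is what lets the second pass see d*d as dd. A single combined pattern only sees duplicates that are adjacent in the original string.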
Joe answered Oct 20 '22 15:10


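Non-regex version (collapses consecutive duplicates only; it does not strip special characters):

var oldString = "aabcd*def%gGGhhhijkklmnoP\122";
var newString = "";

var len = oldString.length;
var c = oldString[0];
for ( var i = 1; i < len; ++i ) {
  // Emit the previous character only when the run it belongs to ends.
  if ( c != oldString[i] ) {
    newString += c;
  }
  c = oldString[i];
}
newString += c; // the loop never emits the last character, so append it here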
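For completeness, here is a sketch of a loop-based variant that also skips non-alphanumeric characters, so duplicates separated by symbols are merged as the question asks. The names isAlphaNumeric and dedupeConsecutive are hypothetical helpers of mine, and the check only covers ASCII letters and digits.

```javascript
// Hypothetical helper: true for ASCII letters and digits only.
function isAlphaNumeric(ch) {
  return (ch >= "0" && ch <= "9") ||
         (ch >= "a" && ch <= "z") ||
         (ch >= "A" && ch <= "Z");
}

// Hypothetical helper: drop non-alphanumerics and consecutive repeats in one pass.
function dedupeConsecutive(input) {
  let result = "";
  let last = ""; // last character that was appended to result
  for (const ch of input) {
    if (!isAlphaNumeric(ch)) continue; // skip symbols, spaces, and so on
    if (ch !== last) {
      result += ch;
      last = ch;
    }
  }
  return result;
}

console.log(dedupeConsecutive("aabcd*def%gGG")); // "abcdefgG"
```

Tracking the last kept character, rather than the previous input character, is what makes d*d collapse to a single d. An empty input simply returns an empty string.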
hsz answered Oct 20 '22 15:10