
How do you remove repeated characters in a string

Tags: string, c#, regex

I have a website which allows users to comment on photos. Of course, users leave comments like:

'OMGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG!!!!!!!!!!!!!!!'

or

'YOU SUCCCCCCCCCCCCCCCCCKKKKKKKKKKKKKKKKKK'

You get it.

Basically, I want to shorten those comments by removing at least most of those excess repeated characters. I'm sure there's a way to do it with a regex; I just can't figure it out.

Any ideas?

Ed B asked Dec 13 '10 14:12



2 Answers

Do you specifically want to shorten the strings in the code, or would it be enough to simply fail validation and present the form to the user again with a validation error? Something like "Too many repeated characters."

If the latter is acceptable, @"(\w)\1{2}" will match wherever the same word character appears three or more times in a row (that is, the character "repeated" two or more times).

Edit: As @Piskvor pointed out, this will match on exactly 3 characters. It works fine for matching, but not for replacing. His version, @"(\w)\1{2,}", would work better for replacing. However, I'd like to point out that I think replacing wouldn't be the best practice here. Better to just have the form fail validation than to try to scrub the text being submitted, because there likely will be edge cases where you turn otherwise readable (even if unreasonable) text into nonsense.
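As a rough sketch of the two options discussed above, the first pattern can drive a validation check and the second a replacement. The class and method names here (RepeatCheck, HasExcessRepeats, CollapseRuns) are made up for illustration, and collapsing each run to a single character is only one possible replacement policy.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative, not from the original answer.
public static class RepeatCheck
{
    // Three identical word characters in a row: enough to detect a long run.
    private static readonly Regex TripleRun = new Regex(@"(\w)\1{2}");

    // The whole run of three or more, so a replacement consumes all of it.
    private static readonly Regex LongRun = new Regex(@"(\w)\1{2,}");

    // Validation route: reject the comment and let the user fix it.
    public static bool HasExcessRepeats(string comment) => TripleRun.IsMatch(comment);

    // Replacement route (assumed policy): collapse each long run to one character.
    public static string CollapseRuns(string comment) => LongRun.Replace(comment, "$1");
}
```

Note that \w does not match punctuation, so a run of "!" is left alone by both patterns. The replacement route also shows the kind of edge case mentioned above: CollapseRuns("1000000 views") returns "10 views", which changes the meaning of the comment.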

David answered Oct 19 '22 08:10



Keep in mind that English uses double letters often, so you probably don't want to eliminate them blindly. Here is a regex that removes anything beyond a double.

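using System.Text.RegularExpressions;

// (.) consumes one character; the lookbehind then requires that the three
// characters ending at that position are all the same. Only the third and
// later characters of a run match, so only those are removed.
Regex r = new Regex("(.)(?<=\\1\\1\\1)", RegexOptions.IgnoreCase | RegexOptions.CultureInvariant | RegexOptions.Compiled);

var x = r.Replace("YOU SUCCCCCCCCCCCCCCCCCKKKKKKKKKKKKKKKKKK", String.Empty);
// x = "YOU SUCCKK"

var y = r.Replace("OMGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG!!!!!!!!!!!!!!!", String.Empty);
// y = "OMGG!!"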
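For comparison, a similar result can probably be had without a lookbehind by matching each whole run and substituting the captured character twice. This is only a sketch of that alternative: RepeatTrimmer and TrimToDoubles are hypothetical names, and the pattern is a swapped-in variant, not the regex used above.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper; an alternative formulation, not the answer's regex.
public static class RepeatTrimmer
{
    // A run of three or more of the same character, captured from its first one.
    private static readonly Regex Run = new Regex(
        @"(.)\1{2,}",
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

    // Replace each long run with two copies of its first character.
    public static string TrimToDoubles(string text) => Run.Replace(text, "$1$1");
}
```

TrimToDoubles("OMGGGG!!!!") gives "OMGG!!", and text with only ordinary doubles, such as "book keeper", comes back unchanged. One difference to be aware of: on a mixed-case run this variant emits the first character twice, whereas the lookbehind version keeps the first two characters exactly as typed.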
Ryan Pedersen answered Oct 19 '22 08:10
