
Regular expression to replace two (or more) consecutive identical characters with only one?


In Java, which regular expression can be used to collapse runs of the same character into a single one, for example:

before: aaabbb after: ab

before: 14442345 after: 142345

Thanks!

Alex. S. asked Sep 19 '08 22:09




2 Answers

In Perl:

s/(.)\1+/$1/g;

does the trick. I assume that if Java has Perl-compatible regexps, it should work there too.

Edit: Here is what it means

s {
    (.)  # match any character (and capture it)
    \1   # if it is followed by itself
    +    # one or more times
}{$1}gx;  # and replace the whole thing with the first captured character (the g modifier replaces all occurrences)

Edit: As others have pointed out, the syntax in Java would become

original.replaceAll("(.)\\1+", "$1"); 

Remember to escape the backslash in \1: inside a Java string literal it has to be written as \\1.
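For completeness, here is a minimal self-contained sketch of the same idea with a precompiled pattern, which may be preferable if the replacement runs many times. The class name CollapseRepeats and the method name collapse are made up for illustration and are not part of any library. Note that `.` does not match line terminators by default, so runs of newlines are left alone in this sketch.

```java
import java.util.regex.Pattern;

public class CollapseRepeats {
    // (.) captures any single character; \\1+ matches one or more
    // further copies of that same character immediately after it.
    private static final Pattern REPEATED = Pattern.compile("(.)\\1+");

    // Hypothetical helper: replaces every run of a repeated character
    // with a single occurrence of that character.
    public static String collapse(String input) {
        return REPEATED.matcher(input).replaceAll("$1");
    }

    public static void main(String[] args) {
        System.out.println(collapse("aaabbb"));    // prints "ab"
        System.out.println(collapse("14442345"));  // prints "142345"
    }
}
```

Both inputs are the examples from the question; strings without repeated characters should come back unchanged.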

Pat answered Oct 10 '22 03:10


String a = "aaabbb";
String b = a.replaceAll("(.)\\1+", "$1");
System.out.println("'" + a + "' -> '" + b + "'");
Jorge Ferreira answered Oct 10 '22 05:10