RegExp match repeated characters

For example, I have the string:

 aacbbbqq 

As a result, I want to get the following matches:

 (aa, c, bbb, qq)   

I know that I can write something like this:

 ([a]+)|([b]+)|([c]+)|...   

But I think it's ugly, and I'm looking for a better solution. I want a regular expression solution, not a self-written finite-state machine.

Andrew asked Jun 10 '11 12:06



2 Answers

You can match that with: (\w)\1*
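
To spell out why this works: `(\w)` captures a single word character, and the backreference `\1*` then matches zero or more further copies of that same character, so each match is one run of identical characters. A minimal sketch in Python, assuming the standard `re` module (the helper name `repeated_runs` is made up for illustration):

```python
import re

def repeated_runs(text):
    """Return each run of identical word characters in text."""
    # finditer is used instead of findall: with a capturing group in the
    # pattern, findall would return only the captured single character,
    # not the whole run.
    return [m.group(0) for m in re.finditer(r"(\w)\1*", text)]

print(repeated_runs("aacbbbqq"))  # ['aa', 'c', 'bbb', 'qq']
```

Note that `\w` only covers letters, digits and underscore. If runs of other characters (spaces, punctuation) should match too, `(.)\1*` is one possible variant.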

Qtax answered Sep 22 '22 05:09


itertools.groupby is not a RegExp, but it's not self-written either. :-) A quote from the Python docs:

# [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D 
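
Applied to the string from the question, this could look roughly like the sketch below. The helper name `group_runs` is hypothetical; `groupby` with no key function groups consecutive equal items, which here means consecutive equal characters.

```python
from itertools import groupby

def group_runs(text):
    """Split text into runs of consecutive identical characters."""
    # Each group is an iterator over the characters of one run,
    # so join it back into a string.
    return ["".join(group) for _key, group in groupby(text)]

print(group_runs("aacbbbqq"))  # ['aa', 'c', 'bbb', 'qq']
```

Unlike the `\w`-based pattern, this groups every character, including whitespace and punctuation.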
DrTyrsa answered Sep 22 '22 05:09