
match the same unknown character multiple times

Tags:

regex

I have a regex problem I can't seem to solve. I don't actually know whether regex can do this, but I need to match a character from a range n times at the end of a pattern, e.g. blahblah[A-Z]{n}. The problem is that whichever character matches the ending range must be the same character all n times.

For example, I want to match

  • blahblahAAAAA
  • blahblahEEEEE
  • blahblahQQQQQ

but not

  • blahblahADFES
  • blahblahZYYYY

Is there some regex pattern that can do this?

Erin Aarested asked Jul 12 '12 20:07



2 Answers

You can use this pattern: blahblah([A-Z])\1+

The \1 is a back-reference to the first capture group, in this case ([A-Z]). The + matches that same character one or more additional times. To fix the count, replace the + with a specific number of repetitions using {n}, such as \1{3}, which matches the captured character three more times (four occurrences in total, counting the one consumed by the capture group).

If you need the entire string to match, anchor the pattern with ^ at the start and $ at the end, so that it becomes ^blahblah([A-Z])\1+$

You can read more about back-references here.
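
As a quick check, here is a minimal sketch in Python (my choice of language; the question doesn't name one) that runs the anchored pattern against the strings from the question. The names `pattern`, `samples`, and `results` are just for illustration.

```python
import re

# Anchored form of the pattern above: capture one uppercase letter,
# then require one or more repeats of that same letter via \1.
pattern = re.compile(r"^blahblah([A-Z])\1+$")

# Test strings taken from the question.
samples = [
    "blahblahAAAAA",
    "blahblahEEEEE",
    "blahblahQQQQQ",
    "blahblahADFES",
    "blahblahZYYYY",
]

# Map each sample to True/False depending on whether it matches.
results = {s: bool(pattern.match(s)) for s in samples}

for s, ok in results.items():
    print(s, "matches" if ok else "does not match")
```

The first three strings should match and the last two should not, because after the captured letter the back-reference demands that exact letter again.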

Ahmad Mageed answered Oct 12 '22 15:10


In most regex implementations, you can accomplish this by referencing a capture group in your regex. For your example, you can use the following to match the same uppercase character five times:

blahblah([A-Z])\1{4}

Note that to match the character n times, you need to use \1{n-1}, since the first occurrence is consumed by the capture group itself.
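
To make the n-1 arithmetic concrete, here is a rough Python sketch that builds the pattern for a given n. The helper name `same_char_suffix` and its parameters are hypothetical, and I'm assuming Python's `re` module; other engines with back-reference support should behave the same way for this pattern.

```python
import re

def same_char_suffix(prefix: str, n: int) -> re.Pattern:
    """Build a pattern matching `prefix` followed by one uppercase
    letter repeated exactly n times in total (hypothetical helper)."""
    if n < 2:
        raise ValueError("n must be at least 2 for a repeat to make sense")
    # The capture group consumes the first occurrence, so the
    # back-reference only needs to match the remaining n - 1.
    return re.compile(rf"{re.escape(prefix)}([A-Z])\1{{{n - 1}}}")

five = same_char_suffix("blahblah", 5)
print(five.pattern)  # the generated regex source

# fullmatch requires the whole string to match, so longer or shorter
# runs of the letter are rejected.
print(bool(five.fullmatch("blahblahAAAAA")))
print(bool(five.fullmatch("blahblahAAAA")))
print(bool(five.fullmatch("blahblahZYYYY")))
```

I used `fullmatch` here instead of adding ^ and $ so the pattern text stays identical to the one in the answer; with `search` or `match`, a longer run such as six identical letters would still contain a five-letter match.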

Andrew Clark answered Oct 12 '22 13:10