
What regex can match sequences of the same character?

Tags: regex, perl

A friend asked me this and I was stumped: Is there a way to craft a regular expression that matches a sequence of the same character? E.g., match on 'aaa', 'bbb', but not 'abc'?

m|\w{2,3}|  

Wouldn't do the trick as it would match 'abc'.

m|a{2,3}|  

Wouldn't do the trick as it wouldn't match 'bbb', 'ccc', etc.

Bill asked Mar 13 '09 21:03




2 Answers

Sure thing! Grouping and references are your friends:

(.)\1+ 

This matches two or more occurrences of the same character: (.) captures one character, and \1+ requires that same character again, one or more times. To restrict it to word characters, use \w instead of the dot:

(\w)\1+ 
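
For anyone who wants to try it, here is a minimal sketch of the pattern in use against the strings from the question. The helper names has_run and first_run are made up for illustration; they are not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper: true when $text contains a run of two or more
# identical word characters.
sub has_run {
    my ($text) = @_;
    return $text =~ /(\w)\1+/ ? 1 : 0;
}

# Hypothetical helper: returns the first such run, or undef if there is none.
sub first_run {
    my ($text) = @_;
    return undef unless $text =~ /(\w)\1+/;
    # @- and @+ hold the start and end offsets of the last successful match.
    return substr($text, $-[0], $+[0] - $-[0]);
}

for my $word (qw(aaa bbb abc aabbb)) {
    my $run = first_run($word);
    print "$word: ", (defined $run ? "run '$run'" : 'no run'), "\n";
}
```

Note that the match is leftmost, so for 'aabbb' the first run reported is 'aa', not the longer 'bbb'.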
David Hanak answered Sep 23 '22 19:09


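Note that Perl 5.10 and later also offer alternative notations for backreferences: \g{1} (absolute), \g{-1} (relative to the current position), and named groups referenced with \g{name} or \k<name>.

use 5.010;    # enables say; the \g and \k notations also need Perl 5.10+

foreach (qw(aaa bbb abc)) {
    say;
    say ' original' if /(\w)\1+/;
    say ' new way'  if /(\w)\g{1}+/;
    say ' relative' if /(\w)\g{-1}+/;
    say ' named'    if /(?'char'\w)\g{char}+/;
    say ' named'    if /(?<char>\w)\k<char>+/;
}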
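
One thing the named form makes convenient is reading the repeated character back out through the %+ hash. The following is only a sketch under that assumption; repeated_runs is a hypothetical helper name, not something from the answer.

```perl
use strict;
use warnings;
use 5.010;    # say, named captures and %+ need Perl 5.10 or later

# Hypothetical helper: returns a list of [character, run length] pairs,
# one for every run of two or more identical word characters in $text.
sub repeated_runs {
    my ($text) = @_;
    my @runs;
    while ($text =~ /(?<char>\w)\k<char>+/g) {
        # $+{char} is the captured character; @- and @+ give the match span.
        push @runs, [ $+{char}, $+[0] - $-[0] ];
    }
    return @runs;
}

# Print each run as "character x length".
say "$_->[0] x $_->[1]" for repeated_runs('aaabccdddd');
```

The /g modifier in scalar context makes each pass of the while loop resume where the previous match ended, so single characters such as the 'b' here are simply skipped.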
oylenshpeegul answered Sep 19 '22 19:09