A friend asked me this and I was stumped: Is there a way to craft a regular expression that matches a sequence of the same character? E.g., match on 'aaa', 'bbb', but not 'abc'?
m|\w{2,3}|
Wouldn't do the trick as it would match 'abc'.
m|a{2,3}|
Wouldn't do the trick as it wouldn't match 'bbb', 'ccc', etc.
Matching a single character using regex: by default, the dot (.) in a regular expression matches any single character except a newline. The matched character can be a letter, a digit, or any special character.
(?i) makes the regex case-insensitive. (?s), for "single-line mode", makes the dot match all characters, including line breaks.
Use square brackets [] to match any one character in a set. Use \w to match any single word character: 0-9, a-z, A-Z, and _ (underscore). Use \d to match any single digit. Use \s to match any single whitespace character.
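A small Perl sketch of the modifiers and character classes just described. The variable names and sample strings are made up for illustration; note in particular that without (?s) the dot does not match a newline.

```perl
use strict;
use warnings;

my $two_lines = "a\nb";

# Without (?s), the dot refuses to match the newline between a and b.
my $dot_default = ($two_lines =~ /a.b/)     ? 1 : 0;
# With (?s), the dot matches any character, newline included.
my $dot_single  = ($two_lines =~ /(?s)a.b/) ? 1 : 0;
# (?i) turns on case-insensitive matching.
my $no_case     = ('ABC' =~ /(?i)abc/)      ? 1 : 0;
# \w, \d, \s and a bracketed set each match exactly one character.
my $classes     = ('x_7 !' =~ /\A\w\w\d\s[!?]\z/) ? 1 : 0;

print "dot_default=$dot_default dot_single=$dot_single no_case=$no_case classes=$classes\n";
```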
Sure thing! Grouping and references are your friends:
(.)\1+
This will match 2 or more occurrences of the same character. For word-constituent characters only, use \w instead of the dot, i.e.:
(\w)\1+
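To see the backreference at work, here is a minimal sketch. The helper names repeated_runs and is_uniform_2_to_3 are hypothetical, not from the thread. The first pulls every run of a repeated word character out of a longer string; the second is an anchored variant that assumes the {2,3} in the question means "the whole string is 2 or 3 copies of one character".

```perl
use strict;
use warnings;

# Hypothetical helper: return every run of 2+ identical word characters.
sub repeated_runs {
    my ($text) = @_;
    my @runs;
    # The outer group (1) captures the whole run; the inner group (2)
    # captures the first character, which \2 must then repeat.
    while ($text =~ /((\w)\2+)/g) {
        push @runs, $1;
    }
    return @runs;
}

# Hypothetical helper: true only if the entire string is one word
# character repeated 2 or 3 times (first char plus 1 or 2 repeats).
sub is_uniform_2_to_3 {
    my ($text) = @_;
    return $text =~ /\A(\w)\1{1,2}\z/ ? 1 : 0;
}

print join(',', repeated_runs('aaa bbb abc xxyz')), "\n";   # aaa,bbb,xx
for my $word (qw(aaa bb abc aaaa)) {
    print is_uniform_2_to_3($word) ? "match: $word\n" : "no match: $word\n";
}
```

Without the \A and \z anchors, 'aaaa' would also match, since the pattern would be satisfied by the first three characters.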
Note that in Perl 5.10 we have alternative notations for backreferences as well.
use 5.010;  # enables say; \g{...} and named captures need 5.10+

foreach (qw(aaa bbb abc)) {
    say;
    say ' original' if /(\w)\1+/;
    say ' new way'  if /(\w)\g{1}+/;
    say ' relative' if /(\w)\g{-1}+/;
    say ' named'    if /(?'char'\w)\g{char}+/;
    say ' named'    if /(?<char>\w)\k<char>+/;
}