Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Regular expression to match any character being repeated more than 10 times

Tags:

regex

People also ask

What regex syntax is used for matching any number of repetitions?

A repeat is an expression that is repeated an arbitrary number of times. An expression followed by '*' can be repeated any number of times, including zero. An expression followed by '+' can be repeated any number of times, but at least once.

Which regular expression do you use to match one or more of the preceding characters?

The character + in a regular expression means "match the preceding character one or more times". For example A+ matches one or more of character A. The plus character, used in a regular expression, is called a Kleene plus .

What can be matched using (*) in a regular expression?

You can repeat expressions with an asterisk or plus sign. A regular expression followed by an asterisk ( * ) matches zero or more occurrences of the regular expression. If there is any choice, the first matching string in a line is used.

What is '?' In regular expression?

It means "Match zero or one of the group preceding this question mark." It can also be interpreted as the part preceding the question mark is optional. In above example '?' indicates that the two digits preceding it are optional. They may not occur or occur at the most once.


The regex you need is /(.)\1{9,}/.

Test:

#!perl
use warnings;
use strict;
my $regex = qr/(.)\1{9,}/;
print "NO" if "abcdefghijklmno" =~ $regex;
print "YES" if "------------------------" =~ $regex;
print "YES" if "========================" =~ $regex;

Here the \1 is called a backreference. It references what is captured by the dot . between the brackets (.) and then the {9,} asks for nine or more of the same character. Thus this matches ten or more of any single character.

Although the above test script is in Perl, this is very standard regex syntax and should work in any language. In some variants you might need to use more backslashes, e.g. Emacs would make you write \(.\)\1\{9,\} here.

If a whole string should consist of 9 or more identical characters, add anchors around the pattern:

my $regex = qr/^(.)\1{9,}$/;

In Python you can use (.)\1{9,}

  • (.) makes group from one char (any char)
  • \1{9,} matches nine or more characters from 1st group

example:

txt = """1. aaaaaaaaaaaaaaa
2. bb
3. cccccccccccccccccccc
4. dd
5. eeeeeeeeeeee"""
rx = re.compile(r'(.)\1{9,}')
lines = txt.split('\n')
for line in lines:
    rxx = rx.search(line)
    if rxx:
        print line

Output:

1. aaaaaaaaaaaaaaa
3. cccccccccccccccccccc
5. eeeeeeeeeeee

. matches any character. Used in conjunction with the curly braces already mentioned:

$: cat > test
========
============================
oo
ooooooooooooooooooooooo


$: grep -E '(.)\1{10}' test
============================
ooooooooooooooooooooooo

On some apps you need to remove the slashes to make it work.

/(.)\1{9,}/

or this:

(.)\1{9,}

={10,}

matches = that is repeated 10 or more times.


use the {10,} operator:

$: cat > testre
============================
==
==============

$: grep -E '={10,}' testre
============================
==============