
Regex to match repeated characters

Tags:

regex

go

I am trying to create a regex that matches a string if it contains the same character 3 or more times in a row (e.g. aaaaaa, testtttttt, otttttter).

I have tried the following:

regexp.Compile("[A-Za-z0-9]{3,}")
regexp.Compile("(.){3,}")
regexp.Compile("(.)\\1{3,}")

These match any 3 characters in a row, but not the same character repeated consecutively. Where am I going wrong?

p_mcp asked Mar 02 '16 00:03




1 Answer

What you're asking for cannot be done with true regular expressions; what you need are (irregular) backreferences. While many regexp engines implement them, RE2, the engine used by Go, does not. RE2 is a fast regexp engine that guarantees linear-time string processing, but there's no known way to implement backreferences with such efficiency. (See https://swtch.com/~rsc/regexp/ for further information.)
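As a quick check of the point above, here is a minimal sketch showing that Go's regexp package refuses a backreference pattern at compile time. The helper name `compiles` is made up for illustration. Note also that in an engine that does support backreferences, "3 or more in a row" would be written `(.)\1{2,}` (one captured character plus two or more repeats), not `{3,}`.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles is a hypothetical helper: it reports whether Go's
// regexp package accepts the given pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// Uses a backreference (\1), which RE2 syntax does not support.
	fmt.Println(compiles(`(.)\1{2,}`)) // false

	// Compiles, but matches ANY 3 characters, not a repeated one.
	fmt.Println(compiles(`(.){3,}`)) // true
}
```

Checking the error returned by regexp.Compile (rather than ignoring it, as in the question) would have surfaced the problem immediately.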

To solve your problem you may want to look for another regexp library. I believe bindings for PCRE can be found, but I have no personal experience with them.

Another approach would be to parse the string manually without using (ir)regular expressions.
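The manual approach could look something like the sketch below. The function name `hasRepeatedRun` and its parameter `n` are made up for illustration; it walks the string rune by rune and tracks the length of the current run of identical characters.

```go
package main

import "fmt"

// hasRepeatedRun (hypothetical name) reports whether s contains
// the same character at least n times in a row.
func hasRepeatedRun(s string, n int) bool {
	var prev rune
	count := 0
	for i, r := range s {
		if i > 0 && r == prev {
			count++ // same character as before: extend the run
		} else {
			count = 1 // different character: start a new run
		}
		if count >= n {
			return true
		}
		prev = r
	}
	return false
}

func main() {
	for _, s := range []string{"aaaaaa", "testtttttt", "otttttter", "letter"} {
		fmt.Println(s, hasRepeatedRun(s, 3))
	}
}
```

This runs in a single pass over the string, and ranging over the string yields runes, so multi-byte characters are compared as whole characters rather than as individual bytes.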

LemurFromTheId answered Oct 12 '22 05:10