
Regex to find any character used more than 3 times in a string but not consecutively

Tags: regex

I have found all sorts of really close answers already, but none that quite fits.

I need to look at a string and find any character that is used more than 3 times, basically so that a password check disallows "mississippi", as it has more than 3 s's in it. I think it only needs to cover letters, but it should handle Unicode, so I guess [[:alpha:]] is the character class to match on.

I found (\w)\1{3,}, which finds consecutive repeats, like ssss or missssippi, but not repeats that are spread out across the string.

I am working my way through the other regex questions to see if someone has already answered this, but there are lots of them, and no joy yet.

geoffc asked Jan 22 '23 22:01


2 Answers

This should do it:

/(.)(.*\1){3}/

It doesn't make any sense to try to combine this with checking for allowable characters. You should first test that all characters are allowable characters and then run this test afterwards. This is why it's OK to use '.' here.
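As a rough illustration, here is how the pattern above might be applied in Python. This is only a sketch: the choice of Python's `re` module and the helper name `has_overused_char` are assumptions made for the example, not part of the question.

```python
import re

# (.) captures one character; (.*\1){3} then requires that same character
# three more times, with anything in between: 4 occurrences in total.
REPEATED = re.compile(r"(.)(.*\1){3}")

def has_overused_char(password: str) -> bool:
    """Hypothetical helper: True if any character appears more than 3 times."""
    return REPEATED.search(password) is not None

print(has_overused_char("mississippi"))  # True  ('i' and 's' each occur 4 times)
print(has_overused_char("banana"))       # False ('a' occurs only 3 times)
```

Note that `.` does not match a newline by default, which should not matter for passwords.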

It will be slow, though: it would be faster to iterate once over the string and count the characters. For your purpose I doubt it makes much difference, since the strings are so short.
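The counting approach could look something like this in Python. It is a minimal sketch: the function name `overused_chars` and the `limit` parameter are made up for the example.

```python
from collections import Counter

def overused_chars(password: str, limit: int = 3) -> list:
    """Hypothetical helper: characters occurring more than `limit` times."""
    counts = Counter(password)  # a single pass over the string
    # Counter keeps first-seen order, so results follow the password's order.
    return [ch for ch, n in counts.items() if n > limit]

print(overused_chars("mississippi"))  # ['i', 's']
print(overused_chars("banana"))       # []
```

A password would then be rejected whenever the returned list is non-empty.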

Mark Byers answered Jan 25 '23 10:01


(\w)(.*\1){2,}

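Match a "word character", then 2 or more copies of "anything, then the first thing again". Thus 3 or more copies of the first thing, with anything in between. To flag only characters used more than 3 times, as the question asks, use {3,} instead.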
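To see where the threshold falls, a quick comparison in Python may help. This is a sketch under assumptions: Python's `re` module is used only for illustration, the names `THREE_OR_MORE` and `FOUR_OR_MORE` are invented here, and the `{3,}` variant is not part of the answer as posted. In Python 3, `\w` on `str` patterns is Unicode-aware by default, so non-ASCII letters are covered, while punctuation is not.

```python
import re

THREE_OR_MORE = re.compile(r"(\w)(.*\1){2,}")  # the pattern as given: 3+ uses
FOUR_OR_MORE = re.compile(r"(\w)(.*\1){3,}")   # variant: more than 3 uses

# "молоко" has three of the letter "о"; "!!!!" has no word characters at all.
for word in ("banana", "mississippi", "молоко", "!!!!"):
    flagged_3 = bool(THREE_OR_MORE.search(word))
    flagged_4 = bool(FOUR_OR_MORE.search(word))
    print(word, flagged_3, flagged_4)
```

With `{2,}` a character used exactly 3 times is already flagged, so "banana" would be rejected. With `{3,}` only 4 or more uses are flagged.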

ephemient answered Jan 25 '23 11:01