I'm pretty new to Regex, only just started learning it in school. I got my first assignment and am getting through it fairly well.
Let me explain so my code makes sense...
The assignment is making my .NET Regex Tester search through a text for passwords.
These passwords cannot contain any whitespace (so I used \S).
They can't start with a number or underscore, so I used (?m:^([^_|^0-9]{1})
They can't end on two different characters:
(?<finalTwo>(?i:\S{1}))(?i:\<finalTwo>)
They have to contain at least one digit, so I used a lookahead. The thing is, the code is pretty cluttered right now:
(?=.*\d)(?m:^([^_|^0-9]{1})(\S*)(?<finalTwo>(?i:\S{1}))(?i:\<finalTwo>))
And I have to add one more thing: the password has to be between 8 and 20 characters long.
I know I have to use {8,20} (I think), but no matter where I put it, it completely kills the search.
Does anybody have an idea how I can solve this?
Much appreciated.
[Disclaimer, that's a pretty long answer!]
I will begin with the character limit.
You'll have to use (?<!\S) and (?!\S) to mark the beginning and end of the password, and \S{8,20} for the password itself:
(?m)(?<!\S)\S{8,20}(?!\S)
As you probably already know, (?m) is for multiline mode (^ and $ match the beginning and end of each line, respectively, instead of the whole string in this mode).
(?<!\S) makes sure there's no non-whitespace character before the password.
(?!\S) makes sure there's no non-whitespace character after the password.
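To see the length limit and the two lookarounds working together, here is a minimal sketch. It uses Python's re module instead of the .NET tester from the question (an assumption made purely for illustration; this particular pattern uses only syntax both engines share), and the sample text is made up.

```python
import re

# The base pattern from the answer: 8 to 20 non-whitespace characters
# with no non-whitespace character directly before or after them.
BASE = re.compile(r"(?m)(?<!\S)\S{8,20}(?!\S)")

# Hypothetical sample text: a 5-character word, an 8-character word,
# a run of 21 characters and a run of 20 characters.
text = "short abcdefgh " + "x" * 21 + " " + "y" * 20

matches = BASE.findall(text)
print(matches)  # ['abcdefgh', 'yyyyyyyyyyyyyyyyyyyy']
```

The 21-character run is rejected as a whole: the lookbehind stops a match from starting in the middle of it, and the lookahead stops one from ending in the middle of it.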
Now we add some restrictions:
Cannot begin with a digit or underscore: (?![0-9_]), a negative lookahead at the start of the password:
(?m)(?<!\S)(?![0-9_])\S{8,20}(?!\S)
Must contain at least one digit: (?=\S+[0-9]), a positive lookahead at the start of the password:
(?m)(?<!\S)(?![0-9_])(?=\S+[0-9])\S{8,20}(?!\S)
Must end with two identical characters: you'll have to capture the second-to-last character and use a backreference. You can change the \S{8,20} part to \S{6,18}(\S)\1 for this:
(?m)(?<!\S)(?![0-9_])(?=\S+[0-9])\S{6,18}(\S)\1(?!\S)
Now that should be good.
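A quick way to check the finished pattern against each rule is to run it over a few candidates. The sketch below does that with Python's re module rather than the .NET tester (an assumption for illustration only). The helper name find_passwords and the sample strings are hypothetical, apart from $@4*&AUn++.

```python
import re

# The complete pattern built up above, unchanged.
PASSWORD = re.compile(
    r"(?m)(?<!\S)(?![0-9_])(?=\S+[0-9])\S{6,18}(\S)\1(?!\S)"
)

def find_passwords(text):
    """Return every substring of text that satisfies all the rules."""
    # group(0) is the whole match; group(1) is only the repeated character.
    return [m.group(0) for m in PASSWORD.finditer(text)]

sample = (
    "abc123xx "    # valid
    "1abc23xx "    # starts with a digit
    "_abc23xx "    # starts with an underscore
    "abcdefxx "    # contains no digit
    "abc123xy "    # last two characters differ
    "ab1xx "       # shorter than 8 characters
    "$@4*&AUn++"   # valid, made of symbols
)

found = find_passwords(sample)
print(found)  # ['abc123xx', '$@4*&AUn++']
```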
To your regex now:
(?m:^([^_|^0-9]{1})
First, the {1} is redundant: removing it doesn't change anything at all.
(?m:^([^_|^0-9])
Second, you have unbalanced parentheses. I'm not sure what that was supposed to be, but I guess the paren before the character class wasn't intended.
(?m:^[^_|^0-9])
Next, the character class [^_|^0-9] matches any character except _, |, ^ or the range 0-9. I'm sure that a password can begin with | or ^. The metacharacter | loses its meaning in a character class, and a ^ that isn't the first character of the class is a literal ^ too! You could use [^_0-9] instead, and this would become:
(?m:^[^_0-9])
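The difference between the two character classes is easy to check. Here is a small sketch in Python's re module (used only for illustration, character classes behave the same way here as in .NET); the (?m:...) wrapper is left out because each hypothetical test string is a single line.

```python
import re

ORIGINAL_CLASS = re.compile(r"^[^_|^0-9]")  # class from the question
FIXED_CLASS = re.compile(r"^[^_0-9]")       # class without | and ^

# Hypothetical candidates, one per interesting first character.
candidates = ["|pipe", "^caret", "_under", "9digit", "alpha"]

rejected_by_original = [s for s in candidates if not ORIGINAL_CLASS.search(s)]
rejected_by_fixed = [s for s in candidates if not FIXED_CLASS.search(s)]

print(rejected_by_original)  # ['|pipe', '^caret', '_under', '9digit']
print(rejected_by_fixed)     # ['_under', '9digit']
```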
It's okay to use this, but keep in mind that it consumes the first character of the password: of the 8 to 20 characters you have to respect, only 7 to 19 are left for the rest of the pattern. The only remaining problem is that it also accepts a space. You can put one in the character class to avoid this:
(?m:^[^_0-9 ])
Okay, looks better now, next one:
(?<finalTwo>(?i:\S{1}))(?i:\<finalTwo>)
First is a named capture group, okay, containing a non-capturing group with case-insensitive mode on and \S{1} inside that non-capturing group. The (?i:...) wrapper has no effect on \S, since there are no letters in it; around the backreference it only matters if a pair like aA should count as two identical characters. Once again, the {1} is redundant. Removing it and the (?i) mode, this becomes:
(?<finalTwo>\S)(?:\<finalTwo>)
That's not so bad: as long as it is matched against the last two characters, it will indeed work.
(?=.*\d)
Works well. You might want to look out for the characters other than 0-9 that \d matches (in .NET it also matches other Unicode decimal digits unless RegexOptions.ECMAScript is set), but if you don't mind those, it almost works. It'd be better to use \S instead of . here: if two passwords sit next to each other in the text, separated by a space, a digit in the second one would satisfy the lookahead for the first, and things would not go as you intended.
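Both points can be seen in a small sketch. It uses Python's re module for illustration (an assumption; the question uses a .NET tester, but \d covers non-ASCII decimal digits by default in both). The (?<!\S) and \S+ around each lookahead are a simplification added here so there is something to match, and the sample text is hypothetical.

```python
import re

# Lookahead with '.' versus lookahead with '\S', each followed by a
# simplified "password" body that just takes one whitespace-free word.
LOOSE = re.compile(r"(?<!\S)(?=.*\d)\S+")
TIGHT = re.compile(r"(?<!\S)(?=\S*\d)\S+")

text = "nodigits has1digit"

loose_hits = LOOSE.findall(text)
tight_hits = TIGHT.findall(text)
print(loose_hits)  # ['nodigits', 'has1digit']
print(tight_hits)  # ['has1digit']

# \d is wider than [0-9]: U+0663 is ARABIC-INDIC DIGIT THREE.
arabic_three = "\u0663"
d_matches = re.fullmatch(r"\d", arabic_three) is not None
range_matches = re.fullmatch(r"[0-9]", arabic_three) is not None
print(d_matches, range_matches)  # True False
```

With .* the lookahead for the first word looks past the space and is satisfied by the digit in the second word, so a word without any digit slips through.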
(\S*)
That part's more or less okay. There's just no limit imposed.
(?=\S*\d)(?m:^[^_0-9 ])(\S*)(?<finalTwo>\S)(?:\<finalTwo>)
Okay, now, remember that (?m:^[^_0-9 ]) took one character, and (?<finalTwo>\S)(?:\<finalTwo>) takes two characters, for a total of 3. That leaves 5 to 17 characters for the middle part, so the limit you impose is:
(?=\S*\d)(?m:^[^_0-9 ])(\S{5,17})(?<finalTwo>\S)(?:\<finalTwo>)
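Here is a rough sketch of how that regex behaves, translated to Python's re module for illustration. The translation is an assumption: Python spells the named group (?P<finalTwo>...) and the backreference (?P=finalTwo), and needs version 3.6 or later for the scoped (?m:...) flag. The sample text is hypothetical, with one candidate per line.

```python
import re

# Same structure as the regex above: 1 + (5 to 17) + 2 characters.
ASKER = re.compile(
    r"(?=\S*\d)(?m:^[^_0-9 ])(\S{5,17})(?P<finalTwo>\S)(?P=finalTwo)"
)

text = "\n".join([
    "abc123xx",        # 8 characters, valid
    "ab1xx",           # 5 characters, too short
    "a1" + "b" * 30,   # 32 characters, too long
])

hits = [m.group(0) for m in ASKER.finditer(text)]
print([len(h) for h in hits])  # [8, 20]
```

The second hit is only the first 20 characters of the 32-character line: the length arithmetic is right, but nothing yet stops the match from ending in the middle of a longer password.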
It almost works; you only need to add something to prevent partial matches of longer passwords. You could usually use word boundaries (\b), but nothing has been said about symbols, so it's safer to assume that a password like $@4*&AUn++ is also allowed, and that's where word boundaries fail. That's why I suggest using the negative lookarounds.
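To illustrate the difference, the sketch below wraps the same core once in \b and once in the negative lookarounds. It uses Python's re module (an assumption for illustration; \b is defined in terms of word characters in both engines), and the text around the password is made up.

```python
import re

# Core rules: no leading digit or underscore, at least one digit,
# 8 to 20 characters, last two characters identical.
CORE = r"(?![0-9_])(?=\S+[0-9])\S{6,18}(\S)\1"

WITH_WORD_BOUNDARIES = re.compile(r"\b" + CORE + r"\b")
WITH_LOOKAROUNDS = re.compile(r"(?<!\S)" + CORE + r"(?!\S)")

text = "login $@4*&AUn++ done"

boundary_hits = [m.group(0) for m in WITH_WORD_BOUNDARIES.finditer(text)]
lookaround_hits = [m.group(0) for m in WITH_LOOKAROUNDS.finditer(text)]

print(boundary_hits)    # []
print(lookaround_hits)  # ['$@4*&AUn++']
```

There is no word boundary between the space and $, or between + and the following space, because none of those are word characters, so the \b version can neither start nor end where the password does.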