What do the symbols mean in preg_match?

Tags:

php

preg-match

I have this expression in a code snippet I borrowed. It forces new users to have a password that not only requires uppercase, lowercase and numbers, but also seems to require them in that order! If I enter lower+upper+numbers, it fails!

if (preg_match("/^.*(?=.{4,})(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z]).*$/", $pw_clean, $matches)) {

I've searched online but can't find a resource that tells me what some of the characters mean. I can see that the pattern is preg_match("/some expression/", yourstring, yourmatch).

What do these mean:

1.  ^          -  ???
2.  .*         -  ???
3.  (?=.{4,})  -  requires 4 characters minimum
4.  (?=.*[0-9])-  requires it to have numbers
5.  (?=.*[a-z])-  requires it to have lowercase
6.  (?=.*[A-Z])-  requires it to have uppercase
7.  .*$        -  ???
asked Apr 18 '12 00:04 by marciokoko


1 Answer

Here are the direct answers. I kept them short because they won't make sense without an understanding of regex, which is best gained at regular-expressions.info. I also advise you to try out the regex helper tools listed there. They let you experiment and see live capturing/matching as you edit the pattern, which is very helpful.


1: The caret ^ is an anchor. It means "the start of the haystack/string/line".

  • If a caret is the first symbol inside a character class [], it has a different meaning: it negates the class. (So in [^ab] the caret makes the class match any single character that is neither a nor b.)
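To make the two meanings of the caret concrete, here is a minimal sketch. The sample strings and variable names are made up for illustration; only preg_match() itself comes from the question.

```php
<?php
// As an anchor: the pattern has to match at the very start of the subject.
$startsWithA = preg_match('/^a/', 'abc'); // 1: "abc" begins with "a"
$noMatch     = preg_match('/^a/', 'bac'); // 0: there is an "a", but not at the start

// Inside a character class the caret negates the class:
// [^ab] matches one character that is neither "a" nor "b".
preg_match('/[^ab]/', 'abba-c', $m);
echo $m[0]; // the first character that is not "a" or "b", here "-"
```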

2: The dot . and the asterisk * serve two separate purposes:

  • The dot matches any single character except newline \n.
  • The asterisk says "allow zero or more of the preceding token".

When these two are combined as .* it basically reads "zero or more of anything, until a newline or another rule comes into effect".
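A small sketch of how .* behaves around a newline. The subject string is invented for the example, and the s (DOTALL) modifier shown here is not part of your pattern; it is only included for contrast.

```php
<?php
$subject = "first line\nsecond line";

// By default the dot does not match "\n", so a greedy .* stops at the line break.
preg_match('/^.*/', $subject, $default);
echo $default[0], "\n"; // "first line"

// With the s modifier the dot matches newlines too, so .* runs to the end.
preg_match('/^.*/s', $subject, $dotall);

// "Zero or more" means .* is also satisfied by nothing at all.
$emptyOk = preg_match('/^.*$/', ''); // 1
```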

7: The dollar $ is also an anchor like the caret, with the opposite function: "the end of the haystack".
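One detail worth knowing, shown in the sketch below: in PHP's PCRE engine the dollar, by default, also matches just before a single trailing newline. If that matters for your input, \z (or the D modifier) is the stricter "very end" anchor. The test strings are made up.

```php
<?php
$plain    = preg_match('/^abc$/', 'abc');    // 1: exact match
$extra    = preg_match('/^abc$/', 'abcd');   // 0: "d" sits between "abc" and the end
$trailing = preg_match('/^abc$/', "abc\n");  // 1: $ tolerates one final newline
$strict   = preg_match('/^abc\z/', "abc\n"); // 0: \z only matches at the very end
```

For a password field this rarely matters, since form input normally has no trailing newline, but it explains the occasional surprising match.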


Edit:

Simple parentheses ( ) around something make it a group. Here you have (?=) which is an assertion, specifically a positive lookahead assertion. All it does is check whether what's inside actually exists forward from the current cursor position in the haystack, without moving the cursor. Still with me?
Example: foo(?=bar) matches foo only if followed by bar. bar is never matched, only foo is returned.
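The foo(?=bar) example can be tried directly. This is just a sketch with made-up subjects; the third pattern is an extra illustration of the fact that a lookahead consumes nothing.

```php
<?php
// "bar" is checked but never becomes part of the match.
preg_match('/foo(?=bar)/', 'foobar', $hit);
echo $hit[0], "\n"; // "foo"

// No "bar" after "foo", so the assertion fails and there is no match.
$miss = preg_match('/foo(?=bar)/', 'foobaz'); // 0

// The cursor stays right after "foo", so \w+ is free to match "bar" itself.
preg_match('/foo(?=bar)\w+/', 'foobar', $rest);
echo $rest[0], "\n"; // "foobar"
```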

With this in mind, let's dissect your regex:

/^.*(?=.{4,})(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z]).*$/

Reads as:
        ^.* From the start, match 0-many of any character
  (?=.{4,}) if at least 4 of anything follow this position
(?=.*[0-9]) if what follows is: 0-many of any, ending with a digit
(?=.*[a-z]) if what follows is: 0-many of any, ending with a lowercase letter
(?=.*[A-Z]) if what follows is: 0-many of any, ending with an uppercase letter
        .*$ 0-many of anything up to the end
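Since every lookahead is an independent check from the same position, the rules can also be tested one at a time. The sketch below does that with a hypothetical helper, failedPasswordRules(), which is not part of your snippet. It assumes a single-line password and PHP 7 or later (for the type declarations), and it is handy for seeing which rule rejects a given input.

```php
<?php
/**
 * Hypothetical helper: returns the names of the rules a password fails.
 * Each pattern is one lookahead from the original regex, anchored at the start.
 */
function failedPasswordRules(string $pw): array
{
    $rules = array(
        'min 4 characters' => '/^(?=.{4,})/',
        'digit'            => '/^(?=.*[0-9])/',
        'lowercase'        => '/^(?=.*[a-z])/',
        'uppercase'        => '/^(?=.*[A-Z])/',
    );

    $failed = array();
    foreach ($rules as $name => $pattern) {
        // preg_match() returns 1 only when the pattern matches
        if (preg_match($pattern, $pw) !== 1) {
            $failed[] = $name;
        }
    }
    return $failed;
}

print_r(failedPasswordRules('1Baa')); // empty array: every rule passes, in any order
print_r(failedPasswordRules('aa11')); // contains only 'uppercase'
```

None of the four checks cares where in the string the character appears, which is why the order of upper, lower and digit should not matter.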

You say the order of password characters matters, but it doesn't in my tests. See the test script below. Hope this cleared up a thing or two. If you are looking for another regex which is a bit more forgiving, see regex password validation.

<pre>
<?php
// Only the last 3 fail, as they should. You claim the first does not work?
$subjects = array("aaB1", "Baa1", "1Baa", "1aaB", "aa1B", "aa11", "aaBB", "aB1");

foreach($subjects as $s)
{
    // Single quotes keep PHP from trying to interpolate the $ in the pattern.
    $res = preg_match('/^.*(?=.{4,})(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z]).*$/', $s, $matches);
    echo "result: ";
    print_r($res);      // 1 on a match, 0 otherwise

    echo "<br>";
    print_r($matches);  // the matched password, or an empty array on failure
    echo "<hr>";
}
?>
</pre>

Excellent online tool for checking and testing Regular Expressions: https://regex101.com/

answered Sep 22 '22 06:09 by ccondrup