I have a bunch of rules in my .htaccess (sub-domains, folders, user-specific folders, etc.)
and I am currently using this regular expression:
([a-z0-9A-Z])
I was looking for a specific rule and found multiple ways to build it, so I was wondering whether there is a standard practice for these. What are the differences, pros and cons of using something like:
([^.]+)
([^/]+)
(.*)
([a-z0-9]+)
Let's say we have this .htaccess:
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?request=$1 [L]
The expressions mentioned in your question work as follows:
^(.*)$
. : match any single character
* : match zero or more of the previous symbol
Basically it will match anything like:
folder1/file1.html : $1 will be folder1/file1.html
file1.html : $1 will be file1.html
This way it is very easy to parse the entire request in PHP or Python. On the other hand, the rule filters nothing, so any unwanted characters in the URL reach your script and you will have to validate them there.
Example of a request that would be passed through unchanged:
=@*-+
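To make the trade-off concrete, here is a minimal sketch that uses Python's re module as a stand-in for mod_rewrite's regex engine (an assumption that holds for patterns this simple). The names capture_all, is_safe and the ALLOWED whitelist are hypothetical, chosen only for illustration; they are not part of the original rules.

```python
import re

# Same pattern as in the RewriteRule above; Python's re is only a stand-in
# for mod_rewrite's regex engine here.
CATCH_ALL = re.compile(r"^(.*)$")

# Hypothetical whitelist the script has to enforce itself,
# because the catch-all rule filters nothing.
ALLOWED = re.compile(r"^[a-zA-Z0-9/._-]*$")


def capture_all(path):
    """Return what $1 would hold for the given request path."""
    match = CATCH_ALL.search(path)
    return match.group(1) if match else None


def is_safe(request):
    """Validate the captured value before the script uses it."""
    return ALLOWED.search(request) is not None


print(capture_all("folder1/file1.html"))   # folder1/file1.html
print(capture_all("=@*-+"))                # =@*-+
print(is_safe(capture_all("=@*-+")))       # False
```

The catch-all hands everything over as-is, so the validation step in the script is not optional.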
([^.]+)
[] : match any of the symbols inside the square brackets
[^] : match any character other than what is listed inside the brackets (ref).
+ : match one or more of the previous symbol
[^.] : match any character other than the . character. Matching stops when a . character is found.
From ref:
The only special characters or metacharacters inside a character class are the closing bracket (]), the backslash (\), the caret (^) and the hyphen (-). The usual metacharacters are normal characters inside a character class, and do not need to be escaped by a backslash. To search for a star or plus, use [+*]. Your regex will work fine if you escape the regular metacharacters inside a character class, but doing so significantly reduces readability.
Basically it will match anything like:
folder1/file1.html : $1 will be folder1/file1
file1.html : $1 will be file1
This has the same effect as the first one, except that it strips everything from the first dot onward.
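The behaviour of ([^.]+) depends on whether the pattern is anchored, because mod_rewrite matches a pattern anywhere in the path unless ^ and $ are present. The sketch below illustrates this with Python's re.search as a stand-in for the rewrite engine (an assumption); first_group is a hypothetical helper name.

```python
import re

UNANCHORED = re.compile(r"([^.]+)")    # as written in the question
ANCHORED = re.compile(r"^([^.]+)$")    # same class, but must span the whole path


def first_group(pattern, path):
    """Return what $1 would hold, or None if the rule would not match."""
    match = pattern.search(path)
    return match.group(1) if match else None


print(first_group(UNANCHORED, "folder1/file1.html"))  # folder1/file1
print(first_group(UNANCHORED, "file1.html"))          # file1
print(first_group(ANCHORED, "file1.html"))            # None
print(first_group(ANCHORED, "folder1/file1"))         # folder1/file1
```

So the unanchored form silently drops the extension, while the anchored form refuses any request that contains a dot.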
^([^/]+)$
[] : match any of the symbols inside the square brackets
+ : match one or more of the previous symbol
^ : match the start of a string
[^/] : match any character other than the / character. Matching stops when a / character is found.
This has the same effect as the first one, except that each group only captures the request up to the next /. So if you have multiple folders, you will have to repeat this group once per path segment.
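Basically it will match anything like (if you have only one set):
folder1/file1.html : $1 will be folder1, but only if the trailing $ is dropped; with ^([^/]+)$ exactly as written, this request does not match at all
file1.html : $1 will be file1.html
and if you have 2 sets separated by a slash, e.g. ^([^/]+)/([^/]+)$:
folder1/file1.html : $1 will be folder1 and $2 will be file1.html
file1.html : not matched by the two-set pattern (it requires a /); the one-set rule still catches it with $1 as file1.html
The more folders you have, the more rules you might have to add.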
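A short sketch of the one-set and two-set variants, with both anchors in place. Python's re module stands in for the rewrite engine (an assumption), and the two-set pattern and the helper name segment_groups are illustrative choices, not rules taken from the question.

```python
import re

ONE_SEGMENT = re.compile(r"^([^/]+)$")
TWO_SEGMENTS = re.compile(r"^([^/]+)/([^/]+)$")


def segment_groups(pattern, path):
    """Return the captured groups ($1, $2, ...) or None if there is no match."""
    match = pattern.search(path)
    return match.groups() if match else None


print(segment_groups(ONE_SEGMENT, "file1.html"))           # ('file1.html',)
print(segment_groups(ONE_SEGMENT, "folder1/file1.html"))   # None
print(segment_groups(TWO_SEGMENTS, "folder1/file1.html"))  # ('folder1', 'file1.html')
print(segment_groups(TWO_SEGMENTS, "file1.html"))          # None
```

Each pattern only accepts paths with exactly the number of segments it was written for, which is why one rule per depth is usually needed.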
^([a-z0-9]+)$ [ ^([a-z0-9.]+)$ for this example ]
[] : match any of the symbols inside the square brackets
+ : match one or more of the previous symbol
a-z : match letters from a to z
0-9 : match digits from 0 to 9
(You can also use \d or \w. Note that \w also matches uppercase letters and the underscore.)
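Basically it will match anything like (if you have only one set, with the dot added):
folder1/file1.html : $1 will be folder1, but only if the trailing $ is dropped; with the anchored pattern as written, this request does not match at all because / is not in the class
file1.html : $1 will be file1.html
and if you have 2 sets separated by a slash, e.g. ^([a-z0-9]+)/([a-z0-9.]+)$:
folder1/file1.html : $1 will be folder1 and $2 will be file1.html
file1.html : not matched by the two-set pattern (it requires a /); the one-set rule still catches it with $1 as file1.html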
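This one works like the previous one, except that you have to specify which characters you want. Therefore, when you check your string in PHP, you already know which characters you can get.
As in my example with the file name, I had to add the dot to the class (a plain . is enough there; \. also works, since the dot needs no escaping inside a character class)
so that it recognises the dot. This one is also faster to execute.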
See the benchmark: .htaccess mod_rewrite performance
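The filtering effect of an explicit character class can be sketched as follows. Python's re module again stands in for the rewrite engine (an assumption), the two-set pattern is an illustrative choice, and whitelist_groups is a hypothetical helper name.

```python
import re

STRICT = re.compile(r"^([a-z0-9]+)$")       # lowercase letters and digits only
WITH_DOT = re.compile(r"^([a-z0-9.]+)$")    # same, plus the dot for file names
TWO_SETS = re.compile(r"^([a-z0-9]+)/([a-z0-9.]+)$")


def whitelist_groups(pattern, path):
    """Return the captured groups, or None if the request is rejected."""
    match = pattern.search(path)
    return match.groups() if match else None


print(whitelist_groups(STRICT, "file1.html"))            # None (dot not allowed)
print(whitelist_groups(WITH_DOT, "file1.html"))          # ('file1.html',)
print(whitelist_groups(TWO_SETS, "folder1/file1.html"))  # ('folder1', 'file1.html')
print(whitelist_groups(WITH_DOT, "=@*-+"))               # None
print(whitelist_groups(WITH_DOT, "File1.html"))          # None (uppercase F)
```

Requests containing anything outside the class never reach the script, at the cost of having to list every character you want to accept.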
So, if you know what type of request you will get, you can always use the last one; if you are not sure, pick the one that best suits your needs. There might be more differences between them, but the primary objective is to understand what each of these regular expressions matches and captures. In addition, performance is something you need to take into consideration: matching everything and then parsing the request in PHP or Python might take longer than matching the parts you need in the rule and using them directly in your script.