
Regex patterns in Apache RewriteCond and friends: full or partial match?

Apache documentation states that

CondPattern is the condition pattern, a regular expression which is applied to the current instance of the TestString [...] CondPattern is a perl compatible regular expression with some additions [...]

And gives such examples as

RewriteCond %{REMOTE_HOST}  ^host1.*  [OR]

I know about Perl regular expressions, but this seems confusing to me:

Are we speaking here of a partial match? (The pattern matches some substring inside the string, as in Perl $string =~ m/pattern/;?)

Or is it rather a full match? (The pattern matches the entire string, as in Java Pattern.matches(pattern, string)?)

If the match is partial, the trailing .* seems redundant. If it's full, then it's the ^ that seems redundant.

Many examples I've found, in the Apache docs and everywhere, have this (apparent to me) inconsistency.

Update: This was an issue with the docs; it has since been corrected.

leonbloy asked Apr 07 '11 20:04



2 Answers

Partial. (The .* is redundant in the example.)
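To make the difference concrete, here is a minimal sketch in Python rather than Apache itself. It assumes Python's re module behaves like PCRE for a pattern this simple; the hostnames and the helper names partial and full are made up for illustration. It compares substring semantics with whole-string semantics and checks whether the trailing .* changes the outcome.

```python
import re

# Hostnames to test; these values are invented for illustration.
hosts = ["host1.example.com", "host1", "myhost1.example.com", "host2.example.com"]

with_tail = re.compile(r"^host1.*")   # pattern as written in the docs example
without_tail = re.compile(r"^host1")  # same pattern with the trailing .* dropped

def partial(pattern, s):
    """Substring semantics, like Perl's $s =~ m/pattern/."""
    return pattern.search(s) is not None

def full(pattern, s):
    """Whole-string semantics, like Java's Pattern.matches."""
    return pattern.fullmatch(s) is not None

for h in hosts:
    # Columns: host, partial with .*, partial without .*, full without .*
    print(h, partial(with_tail, h), partial(without_tail, h), full(without_tail, h))
```

Under substring semantics both patterns accept exactly the same hostnames, so the .* adds nothing. Only under whole-string semantics would ^host1 reject host1.example.com, which is why the .* would matter there and the ^ would not.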

Apache uses the excellent PCRE library (although an older version, Rev: 5.0 2004, last time I checked), written by Philip Hazel, which is very much Perl-like. But I agree, many of their examples leave much to be desired with regard to regex precision.

There are a few Apache-specific extensions, e.g. adding a leading ! to the pattern to apply NOT logic to the match. The best documentation I've found (which you've probably already seen) is this page: Apache Module mod_rewrite
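As a rough illustration of the leading ! extension, the following Python sketch emulates the idea. It is not Apache's implementation: cond_matches is a hypothetical helper, it assumes substring matching, and it ignores every other CondPattern extension.

```python
import re

def cond_matches(cond_pattern: str, test_string: str) -> bool:
    """Hypothetical helper, not an Apache API.

    A leading "!" inverts the result; the remainder is treated as a
    regular expression applied with substring (search) semantics.
    """
    negate = cond_pattern.startswith("!")
    regex = cond_pattern[1:] if negate else cond_pattern
    matched = re.search(regex, test_string) is not None
    return not matched if negate else matched

print(cond_matches("^host1", "host1.example.com"))   # True
print(cond_matches("!^host1", "host1.example.com"))  # False
print(cond_matches("!^host1", "host2.example.com"))  # True
```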

ridgerunner answered Oct 25 '22 16:10



Thanks. I've removed the unnecessary .* from the regular expressions I believe you were referring to in the docs. Please let me know if you find others, and I'll fix those, too.

Regular expressions are substring matches, by default, and don't need to match the entire string (usually). Places in the Apache docs that imply otherwise are errors, and need to be fixed.
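A short Python sketch of what this means in practice, assuming Python's re.search stands in for the substring matching described here (the request path and the matches helper are made up): an unanchored pattern matches anywhere in the string, and you add ^ and $ yourself when you do want the whole string to match.

```python
import re

def matches(pattern: str, s: str) -> bool:
    """Substring match: True if the pattern is found anywhere in s."""
    return re.search(pattern, s) is not None

path = "/images/logo.png"  # invented test string

print(matches(r"logo", path))                  # True: found inside the string
print(matches(r"^logo$", path))                # False: anchors demand the whole string
print(matches(r"^/images/logo\.png$", path))   # True: anchored pattern covers it all
```

The anchors, not a surrounding .*, are what turn a substring match into a whole-string match.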

People tend to throw .* into regular expressions all the time when it is completely unnecessary.

Rich Bowen answered Oct 25 '22 17:10
