
How to use apache2 mod_rewrite within a Directory directive that uses wildcards?

I have written a web application that I run on a dedicated server. Instances of the application are served from different domains, and each domain has its own copy of the application files, allowing for customization as necessary.

I'm running Apache/2.2.16 under Debian Squeeze.

I do all of the configuration under a VirtualHost directive and do not use .htaccess files.

To simplify the Apache configuration, I want to maintain a single Directory directive like this:

<Directory "/srv/www/*/public/">
  RewriteEngine on
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d
  RewriteCond %{REQUEST_URI} !=/favicon.ico
  RewriteCond %{REQUEST_URI} !=/robots.txt
  RewriteRule ^(.+)$ /index.php?q=$1 [L,QSA]
</Directory>

However, the RewriteRule produces the wrong result: with a wildcard Directory path, mod_rewrite fails to strip the per-directory prefix. Here is the output of the rewrite log:

[rid#b9832078/initial] (3) [perdir /srv/www/*/public/] applying pattern '^(.+)$' to uri '/srv/www/domain1/public/login'
[rid#b9832078/initial] (4) [perdir /srv/www/*/public/] RewriteCond: input='/srv/www/domain1/public/login' pattern='!-f' => matched
[rid#b9832078/initial] (4) [perdir /srv/www/*/public/] RewriteCond: input='/srv/www/domain1/public/login' pattern='!-d' => matched
[rid#b9832078/initial] (4) [perdir /srv/www/*/public/] RewriteCond: input='/login' pattern='!=/favicon.ico' => matched
[rid#b9832078/initial] (4) [perdir /srv/www/*/public/] RewriteCond: input='/login' pattern='!=/robots.txt' => matched
[rid#b9832078/initial] (2) [perdir /srv/www/*/public/] rewrite '/srv/www/domain1/public/login' -> '/index.php?q=/srv/www/domain1/public/login'
[rid#b9832078/initial] (3) split uri=/index.php?q=/srv/www/domain1/public/login -> uri=/index.php, args=q=/srv/www/domain1/public/login
[rid#b9832078/initial] (1) [perdir /srv/www/*/public/] internal redirect with /index.php [INTERNAL REDIRECT]
[rid#b9847440/initial/redir#1] (3) [perdir /srv/www/*/public/] applying pattern '^(.+)$' to uri '/srv/www/domain1/public/index.php'
[rid#b9847440/initial/redir#1] (4) [perdir /srv/www/*/public/] RewriteCond: input='/srv/www/domain1/public/index.php' pattern='!-f' => not-matched
[rid#b9847440/initial/redir#1] (1) [perdir /srv/www/*/public/] pass through /srv/www/domain1/public/index.php

The problem is that the RewriteRule 'uri' is the filesystem path rather than the url path, which results in the query string being incorrect: q=/srv/www/domain1/public/login
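
For anyone reproducing this, a log like the one above can be produced on Apache 2.2 with something along these lines. This is a minimal sketch: the log path is an assumption, and level 4 is chosen only because the output above contains level-4 lines.

```
# Apache 2.2 only: RewriteLog and RewriteLogLevel were removed in 2.4,
# where "LogLevel rewrite:traceN" is used instead.
# The path below is an assumption; use whatever location suits the server.
RewriteLog "/var/log/apache2/rewrite.log"
# Level 4 is enough to show the RewriteCond evaluation lines.
RewriteLogLevel 4
```

These directives belong in the server or VirtualHost configuration, not inside the Directory section.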

Explicitly specifying the Directory path like this:

<Directory "/srv/www/domain1/public/">
  RewriteEngine on
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d
  RewriteCond %{REQUEST_URI} !=/favicon.ico
  RewriteCond %{REQUEST_URI} !=/robots.txt
  RewriteRule ^(.+)$ /index.php?q=$1 [L,QSA]
</Directory>

Works just fine. Here is the rewrite log showing the correct behavior. The difference is the additional first line ("strip per-dir prefix"), which feeds the correct input to the rest of the rewrite and results in the correct query string, q=login:

[rid#b9868048/initial] (3) [perdir /srv/www/domain1/public/] strip per-dir prefix: /srv/www/domain1/public/login -> login
[rid#b9868048/initial] (3) [perdir /srv/www/domain1/public/] applying pattern '^(.+)$' to uri 'login'
[rid#b9868048/initial] (4) [perdir /srv/www/domain1/public/] RewriteCond: input='/srv/www/domain1/public/login' pattern='!-f' => matched
[rid#b9868048/initial] (4) [perdir /srv/www/domain1/public/] RewriteCond: input='/srv/www/domain1/public/login' pattern='!-d' => matched
[rid#b9868048/initial] (4) [perdir /srv/www/domain1/public/] RewriteCond: input='/login' pattern='!=/favicon.ico' => matched
[rid#b9868048/initial] (4) [perdir /srv/www/domain1/public/] RewriteCond: input='/login' pattern='!=/robots.txt' => matched
[rid#b9868048/initial] (2) [perdir /srv/www/domain1/public/] rewrite 'login' -> '/index.php?q=login'
[rid#b9868048/initial] (3) split uri=/index.php?q=login -> uri=/index.php, args=q=login
[rid#b9868048/initial] (1) [perdir /srv/www/domain1/public/] internal redirect with /index.php [INTERNAL REDIRECT]
[rid#b987d5f8/initial/redir#1] (3) [perdir /srv/www/domain1/public/] strip per-dir prefix: /srv/www/domain1/public/index.php -> index.php
[rid#b987d5f8/initial/redir#1] (3) [perdir /srv/www/domain1/public/] applying pattern '^(.+)$' to uri 'index.php'
[rid#b987d5f8/initial/redir#1] (4) [perdir /srv/www/domain1/public/] RewriteCond: input='/srv/www/domain1/public/index.php' pattern='!-f' => not-matched
[rid#b987d5f8/initial/redir#1] (1) [perdir /srv/www/domain1/public/] pass through /srv/www/domain1/public/index.php

I expect I'm running into a bug with Apache, but if that isn't the case, what am I doing wrong?

While I appreciate suggestions for other workable approaches, I'd prefer an answer that solves it with the approach I've taken (e.g. not using .htaccess), unless it can be shown that this approach cannot work.

So is there something that has to change in the RewriteCond/RewriteRule directives when they are used within a wildcard Directory?

Side note for the curious: for further simplification I use a single VirtualHost with VirtualDocumentRoot. However, this is unrelated, as the issue is reproducible with a plain DocumentRoot when testing under a single domain.
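
A single-VirtualHost setup of the kind mentioned in the side note might look roughly like this. It is only a sketch: it assumes mod_vhost_alias is loaded and that each site lives in /srv/www/<hostname>/public, which is inferred from the paths in the logs rather than taken from the actual configuration.

```
<VirtualHost *:80>
  # Assumption: take the host name from the request's Host header.
  UseCanonicalName Off
  # %0 expands to the full requested server name, e.g. "domain1".
  VirtualDocumentRoot /srv/www/%0/public
</VirtualHost>
```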

EDIT

OK, I've revisited this based on regilero's answer, and here is what happens. Moving the rewrite directives, as is, out of the Directory section causes a slight initial problem: the query string changes from "login" to "/login". This is fixed by changing the RewriteRule to RewriteRule ^/(.+)$ /index.php?q=$1 [L,QSA], which resolves my previous "inexplicably fails" comment.

Following that, all static files fail to load. Here is the rewrite log showing this problem:

[rid#b7bc7fa0/initial] (2) init rewrite engine with requested uri /login
[rid#b7bc7fa0/initial] (3) applying pattern '^/(.+)$' to uri '/login'
[rid#b7bc7fa0/initial] (4) RewriteCond: input='/login' pattern='!-f' => matched
[rid#b7bc7fa0/initial] (4) RewriteCond: input='/login' pattern='!-d' => matched
[rid#b7bc7fa0/initial] (4) RewriteCond: input='/login' pattern='!=/favicon.ico' => matched
[rid#b7bc7fa0/initial] (4) RewriteCond: input='/login' pattern='!=/robots.txt' => matched
[rid#b7bc7fa0/initial] (2) rewrite '/login' -> '/index.php?q=login'
[rid#b7bc7fa0/initial] (3) split uri=/index.php?q=login -> uri=/index.php, args=q=login
[rid#b7bc7fa0/initial] (2) local path result: /index.php
[rid#b7bc7fa0/initial] (2) prefixed with document_root to /srv/www/domain1/public/index.php
[rid#b7bc7fa0/initial] (1) go-ahead with /srv/www/domain1/public/index.php [OK]
[rid#b7be6b80/initial] (2) init rewrite engine with requested uri /static/css/common.css
[rid#b7be6b80/initial] (3) applying pattern '^/(.+)$' to uri '/static/css/common.css'
[rid#b7be6b80/initial] (4) RewriteCond: input='/static/css/common.css' pattern='!-f' => matched
[rid#b7be6b80/initial] (4) RewriteCond: input='/static/css/common.css' pattern='!-d' => matched
[rid#b7be6b80/initial] (4) RewriteCond: input='/static/css/common.css' pattern='!=/favicon.ico' => matched
[rid#b7be6b80/initial] (4) RewriteCond: input='/static/css/common.css' pattern='!=/robots.txt' => matched
[rid#b7be6b80/initial] (2) rewrite '/static/css/common.css' -> '/index.php?q=static/css/common.css'
[rid#b7be6b80/initial] (3) split uri=/index.php?q=static/css/common.css -> uri=/index.php, args=q=static/css/common.css
[rid#b7be6b80/initial] (2) local path result: /index.php
[rid#b7be6b80/initial] (2) prefixed with document_root to /srv/www/domain1/public/index.php
[rid#b7be6b80/initial] (1) go-ahead with /srv/www/domain1/public/index.php [OK]

But as I said in my comment on regilero's answer, this is solved by prefixing the TestString of the RewriteCond directives with %{DOCUMENT_ROOT}. However, %{DOCUMENT_ROOT} does not work when using VirtualDocumentRoot.

It does not seem right to me that the %{DOCUMENT_ROOT} prefix should be necessary.
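
One possible way around the VirtualDocumentRoot limitation is to rebuild the filesystem path by hand in the TestString. The following is an untested sketch, not a confirmed fix. It assumes the document root layout is /srv/www/<host>/public and that the Host header matches the directory name exactly (no port, no case differences).

```
RewriteEngine on
# Assumption: /srv/www/<Host header>/public is the document root for the request.
# A Host header carrying a port (e.g. "domain1:8080") would break these tests.
RewriteCond /srv/www/%{HTTP_HOST}/public%{REQUEST_URI} !-f
RewriteCond /srv/www/%{HTTP_HOST}/public%{REQUEST_URI} !-d
RewriteCond %{REQUEST_URI} !=/favicon.ico
RewriteCond %{REQUEST_URI} !=/robots.txt
RewriteRule ^/(.+)$ /index.php?q=$1 [PT,L,QSA]
```

Because the Host header is client-supplied, this should be treated with care and checked against the rewrite log before relying on it.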

EDIT

The documentation describes REQUEST_FILENAME as follows:

The full local filesystem path to the file or script matching the request, if this has already been determined by the server at the time REQUEST_FILENAME is referenced. Otherwise, such as when used in virtual host context, the same value as REQUEST_URI.

which explains the need for the DOCUMENT_ROOT prefix.

I've updated the rewrite rules to this:

RewriteCond %{REQUEST_URI} !=/favicon.ico
RewriteCond %{REQUEST_URI} !=/robots.txt
RewriteCond %{REQUEST_URI} !^/static/
RewriteRule ^/(.+)$ /index.php?q=$1 [PT,L,QSA]

This works OK. (Note: the PT flag is necessary to avoid prematurely translating the URL path to a filesystem path when using VirtualDocumentRoot.) The main change in behavior here is that a RewriteCond will be necessary for every entry point into the application, similar to the /static/ line.

EDIT

Here is my final incarnation of Rewrite directives in the VirtualHost outside of any Directory directives:

RewriteEngine on
RewriteCond %{REQUEST_URI} !^/static/
RewriteCond %{REQUEST_URI} !=/favicon.ico
RewriteCond %{REQUEST_URI} !=/robots.txt
RewriteRule ^/(.+)$ /index.php?q=$1 [NS,PT,L,QSA]
RewriteRule ^/$ /index.php [NS,PT,L,QSA]

I've added the NS flag to avoid an extra internal evaluation, and added the second RewriteRule directive instead of relying on mod_dir and DirectoryIndex. My application expects no q= parameter for the root URL. If the application were updated to accept an empty q= parameter for the root URL, a single rule, RewriteRule ^/(.*)$ /index.php?q=$1 [NS,PT,L,QSA], would be sufficient. I may do that in the future.
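
For illustration, the single-rule variant described above would look something like this. It is a sketch that assumes the application has been changed to tolerate an empty q= parameter, which is not the case today.

```
RewriteEngine on
RewriteCond %{REQUEST_URI} !^/static/
RewriteCond %{REQUEST_URI} !=/favicon.ico
RewriteCond %{REQUEST_URI} !=/robots.txt
# (.*) also matches the root URL "/", which yields an empty q= parameter.
RewriteRule ^/(.*)$ /index.php?q=$1 [NS,PT,L,QSA]
```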

chris asked Jun 15 '11 14:06



1 Answer

Very nice and detailed question.

You have almost certainly hit a bug, or at least an undocumented corner of RewriteRule behavior. The documentation states that:

  • The rewrite engine may be used in .htaccess files and in <Directory> sections, with some additional complexity.
  • To enable the rewrite engine in this context, you need to set "RewriteEngine On" and "Options FollowSymLinks" must be enabled. If your administrator has disabled override of FollowSymLinks for a user's directory, then you cannot use the rewrite engine. This restriction is required for security reasons.
  • When using the rewrite engine in .htaccess files the per-directory prefix (which always is the same for a specific directory) is automatically removed for the RewriteRule pattern matching and automatically added after any relative (not starting with a slash or protocol name) substitution encounters the end of a rule set. See the RewriteBase directive for more information regarding what prefix will be added back to relative substitutions.

So there is no mention that a <Directory> section with wildcards will fail to strip the per-directory prefix. And playing with RewriteBase won't help you: it is used to rebuild the final URL, not to alter the per-directory stripping.

But as you can see at the start, there is the phrase "with some additional complexity". Per-directory processing in mod_rewrite is slower and more complex than general RewriteRules outside a directory context. This is also stated in the documentation, mainly because of the per-directory prefix stripping. It also means you can write your RewriteRule outside the <Directory> section, directly in your VirtualHost.

  • it will be faster
  • it will not be hit by this bug
  • it may have some side effects if some nonexistent files in other directories shouldn't be mapped to your index.php?q=$1 rule. But I'm fairly sure this is not a problem in your case.

So simply write (without the wildcard directory):

RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !=/favicon.ico
RewriteCond %{REQUEST_URI} !=/robots.txt
RewriteRule ^(.+)$ /index.php?q=$1 [L,QSA]

And it should work. Let me know if this leads to new problems.

Edit:

OK, I forgot that REQUEST_FILENAME is not yet completely defined in VirtualHost context. This is documented and is 'normal': when the condition is applied, the file lookup on the real path has not been done yet, which is why you must add the document root. So in fact your final solution should be:

RewriteEngine on
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-f
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !=/favicon.ico
RewriteCond %{REQUEST_URI} !=/robots.txt
RewriteRule ^/(.+)$ /index.php?q=$1 [L,QSA]

I tried a second approach that avoids DOCUMENT_ROOT by using late evaluation of REQUEST_FILENAME (%{LA-U:REQUEST_FILENAME} contains the final path, which is in fact the full path to index.php in the case of nonexistent files). But the only way I got it working was by adding a second rule and an [OR] condition, which is less simple, so the first solution is certainly better (KISS).

  RewriteCond %{LA-U:REQUEST_FILENAME} !-f [OR]
  RewriteCond %{LA-U:REQUEST_FILENAME} !/index.php
  RewriteCond %{LA-U:REQUEST_FILENAME} !-d
  RewriteCond %{REQUEST_URI} !=/favicon.ico
  RewriteCond %{REQUEST_URI} !=/robots.txt
  RewriteRule ^/(.+)$ /index.php?q=$1 [L,QSA]

  RewriteCond %{LA-U:REQUEST_FILENAME} /index.php
  RewriteRule ^/(.+)$ /index.php?q=$1 [L,QSA]

regilero answered Oct 22 '22 06:10